Preface



In this book we present a collection of papers around the topic of Agent Commu- 
nication. The communication between agents has been one of the major topics 
of research in multi-agent systems. The current work can therefore build on a 
number of previous workshops, the proceedings of which have been published in 
earlier volumes in this series. The basis of this collection is the accepted sub- 
missions of the workshop on Agent Communication Languages which was held 
in conjunction with the AAMAS conference in July 2003 in Melbourne. The 
workshop received 15 submissions of which 12 were selected for publication in 
this volume. Although the number of submissions was lower than expected for
an important area like Agent Communication, there is no reason to worry that
this area does not get enough attention from the agent community. First of all,
the 12 selected papers are all of high quality. The high acceptance rate is only 
due to this high quality and not to the necessity to select a certain number of 
papers. Besides the high-quality workshop papers, we noticed that many papers 
on Agent Communication found their way to the main conference. We decided 
therefore to invite a number of authors to revise and extend their papers from 
this conference and to combine them with the workshop papers. We believe that 
the current collection comprises a very good and quite complete overview of the 
state of the art in this area of research and gives a good indication of the topics 
that are of major interest at the moment. 

The papers can roughly be divided over the following four topics: 

— Fundamentals of agent communication 

— Agent communication and commitments 

— Communication within groups of agents 

— Dialogues 

Although the topics are of course not mutually exclusive they indicate some 
main directions of research. We therefore have arranged the papers in the book 
according to the topics indicated above. 

The first six papers focus on some fundamental issues in agent communi- 
cation. The paper of A. Jones and X. Parent explains how the semantics of 
messages can be given in terms of the institutional context in which they are 
sent. M. Rovatsos, M. Nickles and G. Weiss go one step further and pose the 
thesis that the interaction itself provides the meaning of the messages. The use 
of cognitive coherence theory is explored in the paper of P. Pasquier, N. An- 
drillon, B. Chaib-draa and M.-A. Labrie. This theory is used to explain why 
certain utterances are used and why some effects are achieved. In the paper of 
R. Kremer, R. Flores and C. La Fournie the performatives that are used in the 
messages are discussed and a hierarchy of performative types is proposed. The 
last two papers in this section deal with the verification of agent communica- 
tion. In the paper of M.-P. Huget and M. Wooldridge model checking is used as 
a method to check the compliance of agent communication to some properties.
U. Endriss, N. Maudet, F. Sadri and F. Toni propose a logical formalism to de- 
scribe communication protocols. The use of this formalism makes it possible to 
verify the communication protocols against some properties such as guaranteed 
termination, answers when you expect them, etc. 

The concept of “commitment” is used by a growing number of researchers in 
agent communication and therefore is given a separate section in this book. The 
first paper of this section is by N. Fornara and M. Colombetti and discusses how 
protocols can be specified when the ACL is based on a semantics of commit- 
ments. A logical model to describe the commitments themselves as a basis for 
agent communication is discussed in the paper of M. Verdicchio and M. Colom- 
betti. J. Bentahar, B. Moulin and B. Chaib-draa argue that commitments can 
be combined into a commitment and argument network to formalize agent com- 
munication. When commitments are used to model agent communication some 
issues arise in how to create and dissolve them. In the paper of A.U. Mallya, P. 
Yolum and M. Singh some of the issues around resolving commitments are dis- 
cussed. In the paper of A. Chopra and M. Singh especially some nonmonotonic 
properties of commitments are handled. 

A relatively new topic that arose at this year’s workshop is that of multi-party 
dialogues. Many issues come up in this setting that do not play a role in dialogues 
between only two agents. The main issues are discussed in the first two papers 
of this section. The paper of D. Traum focuses on the complete setting of the 
dialogues, including the focus of attention, etc. The second paper of F. Dignum 
and G. Vreeswijk discusses the issues from the dialogue perspective. The latter 
paper also gives a first attempt to create a test bed in which one can check the 
properties of multi-party dialogues. This is of particular interest because it will 
be hard to formally prove some of these properties given the complex settings 
and many parameters that play a role. 

In the papers of P. Busetta, M. Merzi, S. Rossi and F. Legras and of F. 
Legras and C. Tessier some practical applications and implications of multi- 
party dialogues are discussed. Finally, in the paper of J. Yen, X. Fan and R.A. 
Volz the importance of proactive communication in teamwork is discussed. 

The last section of the book is centered around the concept of dialogues 
in agent communication. The first two papers discuss some fundamental issues 
concerning dialogues while the other three papers describe some applications of 
dialogue theory in negotiation and resolving discrepancies. The paper of P.E. 
Dunne and P. McBurney handles some issues around the selection of optimal 
utterances within a dialogue. In the paper of S. Parsons, P. McBurney and M. 
Wooldridge the mechanics of the dialogues themselves are discussed. 

In the paper of R.J. Beun and R.M. van Eijk we see the application of 
dialogue games in resolving discrepancies between the ontologies of the agents,
a topic that will certainly become more and more relevant in open agent systems.

The paper of P. McBurney and S. Parsons describes how the idea of “posit 
spaces” can be exploited to describe protocols for negotiation between agents. 
In the final paper by I. Rahwan, L. Sonenberg and F. Dignum a first attempt is
made to describe how negotiation dialogues can be modeled using the interests
of the agents as a basis.

We want to conclude this preface by extending our thanks to the members 
of the program committee of the ACL workshop who were willing to review the 
papers in a very short time span, and also of course to the authors who were 
willing to submit their papers to our workshop and the authors who revised their 
papers for this book. 



October 2003 Frank Dignum 

Utrecht, The Netherlands 
Abstract. This article aims to provide foundations for a new approach
to Agent Communication Languages (ACLs). First, we outline the the- 
ory of signalling acts. In contrast to current approaches to communica- 
tion, this account is neither intention-based nor commitment-based, but 
convention-based. Next, we outline one way of embedding that theory 
within an account of conversation. We move here from an account of the 
basic types of communicative act (the statics of communication) to an 
account of their role in sequences of exchanges in communicative interac- 
tion (the dynamics of communication). Finally, we apply the framework 
to the analysis of a conversational protocol. 



1 Introduction 

Current approaches to conversation can be divided into two basic categories: 

— those that are intention-based or mentalistic. Inspired by Grice [14], these 
approaches focus on the effects communicative acts have on participants’ 
mental states (see e.g. [30,20]); 

— those that are commitment-based, in that they assign a key role to the notion 
of commitment (see e.g. [36,29,9]). 

What the relative merits are of intention-based and convention-based approaches 
to communication is a question that has been much debated within the Philos- 
ophy of Language [14,22,26,3]. We cannot here enter into the details of this 
debate. Suffice it to say that it has become increasingly clear that the role played 
by the Gricean recognition-of-intention mechanism is not as important as one 
might think. Indeed, as far as literal speech acts are concerned, it is necessary 
to assume such a mechanism only for those cases where communicative acts are 
performed in the absence of established conventional rules. On the other hand, 
as some researchers working on Agent Communication Language (ACL) have 
also observed, the intention-based account takes for granted a rather controver- 
sial assumption, according to which agents’ mental states are verifiable. This 
last observation is in fact one of the starting points of the commitment-based 
account as proposed by Singh [29] and Colombetti [9]. However, there are also 
some strong reasons to believe that that account too is fundamentally problem- 
atic. The most obvious reason has to do with the fact that it is not entirely clear 
what it means for speaker j to commit himself to an assertion of p. Should not 
the propositional content of a commitment be a future act of the speaker? If so,
to what action is j preparing to commit himself, when asserting p? A natural
reaction is to say that, in asserting p, speaker j in fact commits himself to defend p
if p is challenged by k. This is the view defended by Walton and Krabbe [36], and
by Brandom [4,5]. However, in line with Levi [21], we believe that this defence
does not stand up to close scrutiny. What counts as an assertion in a
language-game may correlate very poorly with j’s beliefs. For instance, j can say that p
without being able to defend p 1. Does that mean that j is not making an
assertion? If not, what is he doing? As we shall see, to focus exclusively on agents’
commitments amounts, ultimately, to confusing two kinds of norms, which have
been called “preservative” and “constitutive”. The first are the kind that control
antecedently existing activities, e.g. traffic regulation, while the second are the 
kind that create or constitute the activity itself, e.g. the rules of the game. 

Objections of these kinds, we believe, indicate the need for an account of
signalling acts based not on intentions, or commitments, but on public conventions.

The paper is structured as follows. Section 2 outlines the basic assumptions 
and intuitions which motivate the theory of conventional signalling acts. Sec- 
tion 3 outlines one way of embedding that theory within an account of conversa- 
tion. We move here from an account of the basic types of communicative act (the 
statics of communication) to an account of their role in sequences of exchanges 
in communicative interaction (the dynamics of communication). The proposed 
framework is applied to the analysis of a conversational protocol. 

2 Conventional Signalling Acts 

The account of signalling acts outlined in this section bases the characterisa- 
tion of communicative action neither on the intentions of communicators, nor 
on their commitments, but rather on the publicly accessible conventions the
use of which makes possible the performance of meaningful signalling acts. Con- 
sideration, first, of the communicative act of asserting will serve as a means of 
presenting the basic assumptions and intuitions which guide this approach. 

2.1 Indicative Signalling Systems 

The term ‘indicative signalling system’ is here used to refer to a signalling sys- 
tem in which acts of asserting can be performed. Such systems are constituted 
by conventions which grant that the performance, in particular circumstances, 
of instances of a given class of act-types count as assertions, and which also 
specify what the assertions mean. For example, the utterance with a particular 
intonation pattern of a token of the sentence “The ship is carrying explosives” 
will count, in an ordinary communication situation, as an assertion that the ship 
is carrying explosives. The raising, on board the ship, of a specific sequence of 
flags, will also count as an assertion that the ship is carrying explosives. In the 

1 For instance, Levi gives the example of a teacher explaining a thesis to a group of 
students. 
first case the signal takes the form of a linguistic utterance, and in the second
it takes the form of an act of showing flags. These are just two of a number of 
different types of media employed in signalling systems. For present purposes, it 
is irrelevant which medium of communication is employed. But for both of these 
signalling systems there are conventions determining that particular acts count 
as assertions with particular meanings. 

According to Searle [26], if the performance by agent j of a given communicative
act counts as an assertion of the truth of A, then j’s performance counts as
an undertaking to the effect that A is true. What lies behind that claim, surely, 
is that when j asserts that A what he says ought to be true, in some sense or 
other of ‘ought’. The problem is to specify what sense of ‘ought’ this is. (Cf. Ste- 
nius [31].) The view adopted here is that the relevant sense of ‘ought’ pertains 
to the specification of the conditions under which an indicative signalling system 
is in an optimal state: given that the prime function of an indicative signalling 
system is to facilitate the transmission of reliable information, the system is in 
a less than optimal state, relative to that function, when a false signal is trans- 
mitted. The relevant sense of ‘ought’ is like that employed in “The meat ought 
to be ready by now, since it has been in the oven for 90 minutes”. The system, 
in this case the oven with meat in it, is in a sub-optimal state if the meat is not 
ready — things are not then as they ought to be, something has gone wrong. 
The fact that the principles on which the functioning of the oven depends are 
physical laws, whereas the principles on which the signalling system depends are 
man-made conventions, is beside the point: in both cases the optimal functioning 
of the system will be defined relative to the main purpose the system is meant to 
achieve, and thus in both cases failure to satisfy the main purpose will represent 
a less-than-optimal situation. 

Suppose that agents j and k are users of an indicative signalling system s, and 
that they are mutually aware that, according to the signalling conventions gov- 
erning s, the performance by one of them of the act of seeing to it that C is meant 
to indicate that the state of affairs described by A obtains. The question of just 
what kind of act ‘seeing to it that C’ is will be left quite open. All that matters is
that, by convention (in s), seeing to it that C counts as a means of indicating that 
A obtains. The content of the convention which specifies the meaning, in s, of j’s 
seeing to it that C will be expressed using a relativised ‘counts as’ conditional 
(see, for a detailed formal account, [19]), relativised to s, with the sentence $E_j C$
as its antecedent, where $E_j C$ is read ‘j sees to it that C’ or ‘j brings it about
that C’ 2. How, then, is the form of the consequent to be represented? The
communicative act is an act of asserting that A, and thus counts as an undertaking
to the effect that the state of affairs described by A obtains. As proposed in the
previous paragraph, this is interpreted as meaning that, when the communicative
act $E_j C$ is performed, s’s being in an optimal state would require that the
sentence A be true. So the form of the signalling convention according to which, in
s, j’s seeing to it that C counts as an undertaking to the effect that A, is given by



2 The logic of the relativised action operator is given in [19] and [17]. The best available
introduction to this kind of approach to the logic of agency is to be found in [11].
(sc-assert) $E_j C \Rightarrow_s I^{*} A$ (1)

where $I^{*}$ is a relativised optimality, or ideality, operator (a normative operator
of the evaluative kind 3 ), $I^{*} A$ expresses the proposition that, were s to be in an
optimal state relative to the function s is meant to fulfil, A would have to be
true, and $\Rightarrow_s$ is the relativised ‘counts as’ conditional.

We state informally some assumptions we associate with (sc-assert). First, 
signalling system s is likely to contain a number of other conventions of the same 
form, according to which j’s seeing to it that C' counts as an undertaking to 
the effect that A', j’s seeing to it that C" counts as an undertaking to the effect 
that A", ... and so on. So the conventions expressed by conditionals of form 
(sc-assert) may be said to contain the code associated with indicative signalling 
system s — the code that shows what particular kinds of assertive signalling acts 
in s are meant to indicate. We might then also say that s itself is constituted by 
this code. Secondly, we assume that the (sc-assert) conditionals constituting s 
hold true for any agent j in the group U of agents who use s; that is, each agent 
in U may play the role of communicator. Thirdly, we assume that the members 
of U are all mutually aware of the (sc-assert) conditionals associated with s 4 . 
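
The idea that s is constituted by its code can be illustrated with a minimal Python sketch. It is only an informal picture, not part of the formal theory: the names `Convention`, `SignallingSystem`, `meaning` and `is_optimal` are hypothetical, sentences are represented as plain strings, and the optimality operator is approximated by a simple truth check against a set of facts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Convention:
    """One (sc-assert) conditional: doing `act` counts as asserting `content`."""
    act: str      # the act-type C (hypothetical label)
    content: str  # the sentence A that would have to be true if s were optimal


class SignallingSystem:
    """An indicative signalling system s, modelled as its code plus its user group U."""

    def __init__(self, name, users, conventions):
        self.name = name
        self.users = frozenset(users)  # the group U
        self.code = {c.act: c.content for c in conventions}

    def meaning(self, agent, act):
        """Return the content A that `act` indicates, or None if no convention applies."""
        if agent not in self.users:
            return None  # conventions hold only for agents in U
        return self.code.get(act)

    def is_optimal(self, performed, facts):
        """Simplification: s is optimal iff every signalled content is among the facts."""
        for agent, act in performed:
            content = self.meaning(agent, act)
            if content is not None and content not in facts:
                return False  # a false signal was transmitted
        return True


# The flag example from the text, with an invented act label.
flags = SignallingSystem(
    name="s",
    users={"j", "k"},
    conventions=[Convention("raise_flag_sequence", "the ship is carrying explosives")],
)
```

Under these assumptions, `flags.meaning("j", "raise_flag_sequence")` returns the asserted content, and `flags.is_optimal(...)` is false exactly when a signal is sent whose content is not a fact.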

2.2 Communicator and Audience 

Suppose that j and k are both users of signalling system s, and that (sc-assert) 
is any of the signalling conventions in s. Then we adopt the following schema: 

$((E_j C \Rightarrow_s I^{*} A) \land B_k E_j C) \rightarrow B_k I^{*} A$ (2)

The import of the schema is essentially this: if k (the audience) believes that j 
performs the communicative act specified in the antecedent of (sc-assert), then 
k will accept that the consequent of (sc-assert) holds. He believes, then, that 
were signalling system s to be in an optimal state, A would be true. Another 
way of expressing the main point here is as follows: since k is familiar with the 
signalling conventions governing s, he is aware of what j’s doing C is meant to 
indicate, and so, when k believes that j has performed this act, k is also aware 
of what would then have to be the case if the reliability of j’s assertion could be 
trusted. This is not of course to say that k will necessarily trust j’s reliability, 
but if he does so he will then also go on to form the belief that A. In summary, 
assuming (sc-assert) and (2), and supposing that 

$B_k E_j C$ (3)

it now follows that

$B_k I^{*} A$ (4)

3 On the distinction between evaluative and directive normative modalities, see [17]. 
For the logic of the I* operator we adopt a (relativised) classical modal system of 
type EMCN. As is shown in [8], a classical system of this type is identical to the 
smallest normal system K. For details, see [17]. 

4 See [17] for some remarks on the analysis of mutual belief. 
If k now also trusts the reliability of j’s assertion, k goes on to form the belief

$B_k A$ (5)

This type of trust is to be distinguished from ‘trust-in-sincerity’. For we may say
that, in this same communication situation, if k also trusts the sincerity of j’s
assertion, k goes on to form the belief:

$B_k B_j A$ (6)

Note the various possibilities here: k might trust neither the reliability nor the 
sincerity of j’s assertion, in which case neither (5) nor (6) holds. Alternatively, 
k might trust j’s sincerity without trusting the reliability of his assertion ((6), 
but not (5)), or k might trust the reliability of j’s assertion without trusting j’s 
sincerity ((5) but not (6)). The latter case may arise if, for instance, k believes 
that the source of information supplying j is indeed reliable, even though he (k) 
also believes that j does not think the source is reliable. Finally, of course, k 
might trust both the reliability and the sincerity of j’s assertion. 

Note, furthermore, that the set of four trust positions we have just indicated 
may be expanded into a larger set of positions, depending on whether or not j 
is in fact reliable and in fact sincere 5 . 
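
The four trust positions, and the beliefs k forms in each, can be enumerated in a short Python sketch. This is an informal illustration under stated assumptions: the function name `audience_beliefs` is hypothetical, formulas are represented as strings, and the expanded set is computed as a naive cross product that treats j's actual reliability and sincerity as independent of k's trust, which the theory of normative positions may refine.

```python
from itertools import product


def audience_beliefs(trusts_reliability, trusts_sincerity):
    """Beliefs k forms once k believes E_j C, given (sc-assert) and schema (2)."""
    beliefs = {"B_k I* A"}        # (4): follows from (2) and (3) in every case
    if trusts_reliability:
        beliefs.add("B_k A")      # (5): k trusts the reliability of j's assertion
    if trusts_sincerity:
        beliefs.add("B_k B_j A")  # (6): k trusts the sincerity of j's assertion
    return beliefs


# The four trust positions: (trusts reliability?, trusts sincerity?).
trust_positions = list(product([False, True], repeat=2))

# Naive expansion: also vary whether j is in fact reliable and in fact sincere.
expanded_positions = list(product([False, True], repeat=4))

for reliability, sincerity in trust_positions:
    print(reliability, sincerity, sorted(audience_beliefs(reliability, sincerity)))
```

Note that belief (4) appears in every position: it depends only on the convention and on k's belief that the act was performed, not on any trust k places in j.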

It can readily be seen that, in contrast to the approach advocated in the FIPA
COMMUNICATIVE ACT LIBRARY SPECIFICATION [XC00037G, 29.01.2001] 6,
the present account of asserting makes no assumptions about the sincerity
of the communicator. Furthermore, there is no assumption to the effect
that j, when performing the act $E_j C$, intends thereby to produce in k one or
both of the beliefs (5) and (6). Indeed the only background assumption about 
the communicator’s intention that is implicit in this account is that k, when 
forming the belief represented by (4), supposes that j’s communicative act is to 
be taken as a serious, literal implementation of the governing convention (sc- 
assert); i.e., k does not think that j is play-acting, communicating ironically, 
talking in his sleep, etc. In such non-literal communication situations there are 
good reasons (which will not be developed here) for supposing that (2) does not 
hold for a rational audience k. One distinctive feature of the present approach 
is that this background assumption about the communicator’s intention can re- 
main implicit, since the mechanism by means of which assertoric signalling is 
effected turns essentially on the governing signalling conventions — the publicly
accessible rules which show what particular types of communicative acts

5 The use of the term ‘position’ here is quite deliberate, alluding to the theory of 
normative positions, and in particular to some well studied techniques for generating 
an exhaustive characterisation of the class of logically possible situations which may 
arise for a given type of modality (or combination of modalities), for a given set of 
agents, vis-a-vis some state(s) of affairs. See, e.g., [18] and [28] for illustrations of 
the development and application of the generation procedure. A more comprehensive 
account of the concept of trust, which incorporates the notion of ‘trusting what 
someone says’, is presented in [16]. 

6 See http://www.fipa.org/ 
are taken to indicate — rather than on the intentions of agents who employ those
conventions 7 . 

It might also be observed that it is very natural indeed to adopt this back- 
ground assumption in the contexts for which the theory of ACLs is currently 
being developed. For the primary interest there is certainly not in non-literal 
communication, or in ‘communicating one thing but meaning another’, but in 
the literal (albeit quite possibly deceitful) usage of signals with public, conven- 
tional meanings. 

2.3 Commitment 

Some recent approaches to ACLs have assigned a key role to the notion of com- 
mitment (e.g., Singh in [29] and Colombetti in [9]), and it might be suggested 
that when an agent j asserts that A, his act counts as an undertaking to the
effect that A is true in the sense that j commits himself to the truth of A. So
it might be supposed that there is here an alternative way of understanding
the essential rule governing asserting to that offered above in terms of the $I^{*}$
operator.

However, this suggestion raises a number of difficulties. First, just what is 
meant by saying that an agent commits himself to the truth of some sentence 
A? Does it mean that j is under some kind of obligation to accept that A is
true? If so, in relation to which other agents is this obligation held, i.e., who
is it that requires of j that j shall accept the truth of A? Everyone to whom
he addresses his assertion? Surely not, for there may well be members of the
audience who do not care whether j is being sincere, and there may also be 
others who require j to be insincere: perhaps j is their designated ‘spokesman’ 
whom they have instructed to engage in deception when that strategy appears 
to meet their interests. Furthermore, since the current concern with ACLs is 
related to the design of electronic agents, it has to be said that there is very 
little agreement on what it might mean for an electronic agent to enter into a 
commitment. 

The view taken here is that the move towards agent commitment (as the 
basis for understanding the undertaking involved in an act of asserting) is the 
result of a confusion — a confusion which was already indicated by Føllesdal [13]
in his discussion of Stenius. The point is this: the reason why it is very commonly 
required of communicators that they shall tell the truth, or at least attempt to 
tell the truth as they see it, is that conformity to that requirement (that norm) 
will help to preserve the practice of asserting qua practice whose prime function 
is to facilitate the transmission of reliable information. But norms designed to 
preserve the practice should not be confused with the rules or conventions which 

7 Within the Philosophy of Language there has been a good deal of discussion of the 
relative merits of intention-based and convention-based approaches to the charac- 
terisation of communicative acts. This is not the place to enter into that discussion. 
Suffice it to say that FIPA’s approach to ACLs seems to have been heavily,
perhaps one-sidedly, influenced by theories deriving in large measure from the Gricean,
intention-based theory of meaning.
themselves constitute the practice — the conventions whose very existence makes
possible the game of asserting, and which determine that the performance of an 
instance of a given act-type counts as a means of saying that such-and-such a 
state of affairs obtains. An attempt to use the notion of communicator’s com- 
mitment to characterise the nature of asserting confuses preservative norms with 
constitutive conventions. To be sure, those conventions will eventually become 
de-valued, relative to the function they were designed to meet, if there is con- 
tinual violation of the preservative norms. But this should not be allowed to 
obscure the fact that it is the conventions, and not the preservative norms, that 
create the very possibility of playing the asserting game, in an honest way, or 
deceitfully. 

2.4 Some Other Types of Communicative Acts 

Asserting is of course just one type of communicative act among many. This 
section provides just a sketch of some other types, and certainly does not pretend 
to give anything like an exhaustive characterisation of communicative act-types.
But it does illustrate the flexibility and expressive power of the logical framework 
here employed. We consider four types: 

— Commands 

— Commissives (placing oneself under an obligation, e.g., promising) 

— Requests 

— Declaratives (in the sense of Searle & Vanderveken, [27]) 

In each case, the governing signalling convention will take the form of (sc-assert) 
with, crucially, some further elaboration of the scope-formula A in the conse- 
quent. This means that each of these signalling act-types is a sub-species of the 
act of asserting — a consequence which is harmless, and which simply reflects the 
fact that all communicative acts are acts of transmitting information — informa- 
tion which may, or may not, be true. However, as will be suggested in section 2.5, 
there is nevertheless one very important difference between pure assertives and 
these sub-species. 

Commands 

Let j be the agent issuing the command, and let k be the agent who is com- 
manded to see to it that A. Then the form of the governing signalling convention 
is: 



(sc-command) $E_j C \Rightarrow_s I^{*} O E_k A$ (7)

where the ‘O’ operator is a directive normative modality representing obligation. 
So, according to (sc-command), if j sees to it that C, s would then be in an 
optimal state, relative to its function of facilitating the transmission of reliable 
information, if there were an obligation on k to see to it that A. 

Commissives 

Let j be the agent issuing the commissive. Then the form of the governing 
signalling convention is: 
(sc-commit) $E_j C \Rightarrow_s I^{*} O E_j A$ (8)

So, according to (sc-commit), if j sees to it that C, s would then be in an 
optimal state, relative to its function of facilitating the transmission of reliable 
information, if j were himself under an obligation to see to it that A 8 . 

Requests 

Let j be the agent making the request, and let the aim of the request be to get 
agent k to see to it that A. Then the form of the governing signalling convention 
is: 



(sc-request) $E_j C \Rightarrow_s I^{*} H_j E_k A$ (9)

where the relativised ‘H’ operator represents the modality ‘attempts to see to it 
that...’. So, according to (sc-request), if j sees to it that C, s would then be in an 
optimal state, relative to its function of facilitating the transmission of reliable 
information, if j were attempting to see to it that k sees to it that A 9 . 

Declaratives 

These are the kinds of signalling acts that are performed by, for instance, the 
utterance of such sentences as: 

— ‘I pronounce you man and wife’ 

— ‘I name this ship Generalissimo Stalin’ 

— ‘I pronounce this meeting open’ 

The point of declaratives is to create a new state of affairs, which will itself 
often carry particular normative consequences concerning rights and obligations, 
as when two persons become married, or a meeting is declared open. In the 
spirit of the approach developed in [19], we may say that declaratives are used 
by designated agents within institutions as a means of generating institutional 
facts: facts which, when recognised by the institution as established, are deemed 
to have particular kinds of normative consequences. 

Let j be the agent issuing the declarative, and let A describe the state of 
affairs to be created by the declarative. Then the form of the governing signalling 
convention is: 



(sc-declare) $E_j C \Rightarrow_s I^{*} E_j A$ (10)

So, according to (sc-declare), if j sees to it that C, s would then be in an 
optimal state, relative to its function of facilitating the transmission of reliable 
information, if j sees to it that A. For instance, j utters the words ‘I pronounce 
you man and wife’, and then s’s being in an optimal state would require that j 
has indeed seen to it that the couple are married. 

8 On the logic of the directive normative modality, see [17]. 

9 On the logic of the ‘attempts to see to it that...’ modality, see [17]. 
2.5 Being Empowered

For each of the four types just considered, if j is an empowered/authorised agent,
then the mere performance by j of the act of seeing to it that C will be sufficient 
in itself to guarantee the truth of the respective formula to the right of the I* 
operator 10 . For instance, if j is empowered/authorised to command k, then his 
seeing to it that C will indeed create an obligation on k to do A. Likewise, if 
j is empowered/authorised to commit himself, then performing the appropriate 
communicative act will be enough to place himself under an obligation. And if 
j is empowered/authorised to make a request to k, then his communicative act 
will constitute an attempt to get k to do the requested act. And so on. 

Here lies the key to the crucial difference, alluded to above, between pure 
assertions and the other types of communicative act. For pure assertions, there 
is no notion of empowerment or authorisation which will license the inference of 
A from the truth of $I^{*} A$. The closest one could get to such a notion would be
the case where j is deemed to be an authority on the subject about which he is 
making an assertion: but even then, his saying that A does not make it the case 
that A 11 . 

We have now outlined a new formal approach to the theory of ACLs, in 
which a class of signalling conventions, governing some distinct types of com- 
municative acts, can be represented. Other types of communicative act remain 
to be characterised. But we now turn to the task of embedding this ‘static’ ac- 
count of communication within a theory of conversation, in which sequences of 
inter-related signalling acts are transmitted. 

3 Modelling Conversations 

Conversations are essentially dynamic in nature. In this section, we outline one 
possible way of adding a dynamic dimension to the theory of signalling acts, by 
combining it with the arrow logic of van Benthem [32-34] and colleagues [35, 
24]. 

Our proposal is twofold. First, we suggest giving a compact expression to 
conversation protocols, by means of a formula of the object-language. Second, we 
suggest using this kind of representation to provide the beginning of a procedure 
for keeping a record of the conventional effects achieved in a conversation 12 . 

10 We leave implicit here the obvious point that, in many cases, the communicative act 
has to be performed in a particular context — e.g., in the presence of witnesses — if 
it is to achieve its conventional effect. 

11 This is an old idea in a new guise. A number of early contributors to the literature 
on performatives (Lemmon, Aqvist and Lewis, among them) suggested that the 
characteristic feature of performatives, in contrast to constatives, was ‘verifiability 
by use’, or the fact that ‘saying makes it so’. See [15] for references. 

12 Of course, the account outlined in this paper can only be suggestive of how future 
work should proceed. For instance, the account says nothing about the specific cri- 
terion the agent should apply in choosing which utterance will constitute its next 
contribution. For a discussion of this issue, see P.E. Dunne and P. McBurney’s con- 
tribution to this volume ([10]). 
The reason why we do not use dynamic logic in its traditional form (see 
Pratt [25]), is that it presupposes a kind of approach to the logic of agency that 
is very different from the treatment provided in the theory of signalling acts. 
As indicated in section 2.1, the present framework treats agency as a modal 
operator, with some reading such as ‘agent j sees to it that’. Dynamic logic 
has explicit labels for action terms. These are not propositions but (to put it in 
Castañeda’s terms) practitions. 

It might well be the case that temporal logic provides a better account than 
arrow logic. Exploration of this second possibility is the main focus of our current 
investigations. The reason why we have chosen to concentrate first on arrow 
logic is that, when moving to the dynamics, we do not have to redefine the main 
ingredients of the semantics used for the static account. Indeed all we need to do 
is to interpret the points in a model as transitions. The completeness problem 
for the integrated framework is, then, relatively easy. 



3.1 Embedding the Static Account within Arrow Logic 

The syntax of arrow logic has in general the following three building blocks: 

— a binary connective denoted by $\circ$, referred to as “composition” (or “circle”); 

— a unary connective denoted by $\smile$ and written postfix as $A^{\smile}$, referred to as “reverse” (or “cap”); 

— a propositional constant denoted by Id, referred to as “identity”. 

The sentences that replace A, B, ..., that the first two connectives take as 
arguments, are supposed to describe an event, an action, etc. More expressive modal 
operators can be added into the vocabulary of the logic. For present purposes, 
we need not introduce them. Suffice it to observe that this way of turning the 
static account into a dynamic account is very natural, because a frame in arrow 
logic is no more than an ordinary Kripke frame. The only difference is that the 
universe W is viewed as consisting of arrows. These are not links between pos- 
sible worlds. In fact they are treated themselves as the possible worlds 13 . As far 
as the ‘dynamification’ of the static framework is concerned, it then suffices to 
keep the package of truth-clauses already employed in the static framework, and 
to introduce those usually used for the three new building blocks. 

The full account of the framework will be the focus of attention in a longer 
report on this work. Here we will characterise the arrow formalism only in terms 
of its proof-theory, and in terms of the graphics which help to give an intuitive 
account of the three new building blocks. Semantically, the introduction of these 
modalities is straightforward, by adding relations between arrows. 

For instance, the evaluation rule for $\circ$ (“circle”) states that $A \circ B$ is true at 
an arrow $\alpha$ iff $\alpha$ can be decomposed into two arrows at which A and B hold, 
respectively. This can be pictured as in figure 1: 



[Figure: an arrow $\alpha$ decomposed into two arrows $\beta$ and $\gamma$. In $\beta$: A. In $\gamma$: B. In $\alpha$: $A \circ B$.] 
Fig. 1. Composition 

13 In this approach, arrows are not required to have some particular internal structure 
(to be “ordered pairs”, for instance). 



The intended meaning of this connective is relatively transparent. A sentence of 
the form $A \circ B$ can be read as meaning that the event described by A is followed 
by the event described by B. The two arrows at which A and B are evaluated 
can be seen as two intervals (periods of time). 

Next, the evaluation rule for Id (“identity”) says that, for Id to be true at 
$\alpha$, $\alpha$ must be a transition that does not lead to a different state. This can be 
pictured as follows: 




Fig. 2. Identity 

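Finally, the truth-clause for $\smile$ (“reverse”) says that, for $A^{\smile}$ to be true at $\alpha$, 
there must be an arrow $\beta$ that is the reversal of $\alpha$ and at which A holds: 

[Figure: two arrows pointing in opposite directions. In $\alpha$: $A^{\smile}$. In $\beta$: A.] 
Fig. 3. Reverse 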

It is natural to say that such an operator has the meaning of ‘undo-ing’ an 
action. In figure 3, arrow $\beta$, at which A is true, leads from one state to another. 
Intuitively, the endpoint of $\beta$ contains the effects of the performance of A in $\beta$. 
Arrow $\alpha$, at which $A^{\smile}$ is true, goes in the opposite direction, so that the effects 
of the performance of A in transition $\beta$ are cancelled. Of course, we give this 
model for heuristic purposes only, since the formalism is not expressive enough 
to allow us to reason about states as well. However, it is possible (at least in 
principle) to remove this limitation, by switching to so-called two-sorted arrow 
logics. Introduced in van Benthem [33], these are designed for reasoning about 
both states and transitions. It seems very natural to try to refine the formalism 
in such a way that what obtains within states is also taken into account. We 
shall explore this issue in future research. 
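
The three evaluation rules can be made concrete with a small model checker. The following is only an illustrative sketch, not part of the formal framework: it assumes, purely for convenience, that arrows are ordered pairs of states (the framework itself does not require this), and the formula encoding and the function name `holds` are hypothetical.

```python
# Formulas are nested tuples (an illustrative encoding, not the paper's syntax):
#   ("atom", "A"), ("not", f), ("or", f, g), ("comp", f, g), ("rev", f), ("id",)

def holds(f, arrow, arrows, val):
    """Evaluate formula f at an arrow; arrows are (source, target) pairs."""
    kind = f[0]
    if kind == "atom":
        return arrow in val.get(f[1], set())
    if kind == "not":
        return not holds(f[1], arrow, arrows, val)
    if kind == "or":
        return holds(f[1], arrow, arrows, val) or holds(f[2], arrow, arrows, val)
    if kind == "id":
        # Identity: the transition does not lead to a different state.
        return arrow[0] == arrow[1]
    if kind == "rev":
        # Reverse: some arrow in the opposite direction satisfies the argument.
        back = (arrow[1], arrow[0])
        return back in arrows and holds(f[1], back, arrows, val)
    if kind == "comp":
        # Composition: the arrow decomposes into b followed by c.
        return any(
            b[0] == arrow[0] and b[1] == c[0] and c[1] == arrow[1]
            and holds(f[1], b, arrows, val) and holds(f[2], c, arrows, val)
            for b in arrows for c in arrows
        )
    raise ValueError(f"unknown connective: {kind}")

# A tiny example model with three states.
arrows = {(1, 2), (2, 3), (1, 3), (2, 1), (1, 1)}
val = {"A": {(1, 2)}, "B": {(2, 3)}}
A, B = ("atom", "A"), ("atom", "B")

print(holds(("comp", A, B), (1, 3), arrows, val))  # A followed by B
print(holds(("rev", A), (2, 1), arrows, val))      # reversal of an A-arrow
print(holds(("id",), (1, 1), arrows, val))         # a loop
```

In this pair-based reading, the arrow (1, 3) satisfies the composition because it splits into (1, 2), where A holds, and (2, 3), where B holds.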

We now turn to the axiomatic characterization of the framework. When no 
particular constraints are imposed on the semantical counterparts of the dynamic 
operators, the proof theory of the integrated framework can in fact be obtained 
by adding the following rules of inference and axiom schemata to the basic logic: 
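Rules of inference 

(r1)   $\dfrac{\vdash B \to C}{\vdash (A \circ B) \to (A \circ C)}$ 

(r2)   $\dfrac{\vdash A \to C}{\vdash (A \circ B) \to (C \circ B)}$ 

(r3)   $\dfrac{\vdash A \to B}{\vdash A^{\smile} \to B^{\smile}}$ 

(r4)   $\dfrac{\vdash A}{\vdash \neg(\neg A \circ B)}$ 

(r5)   $\dfrac{\vdash A}{\vdash \neg(B \circ \neg A)}$ 

(r6)   $\dfrac{\vdash A}{\vdash \neg((\neg A)^{\smile})}$ 

Axiom schemata 

(a1)   $\vdash (A \lor B) \circ C \to (A \circ C) \lor (B \circ C)$ 

(a2)   $\vdash A \circ (B \lor C) \to (A \circ B) \lor (A \circ C)$ 

(a3)   $\vdash (A \lor B)^{\smile} \to A^{\smile} \lor B^{\smile}$ 
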

The first three rules express a principle of closure under logical consequence. The 
next three are the arrow counterparts of the necessitation rule. Axioms (a1)-(a3) 
say that $\circ$ and $\smile$ distribute over $\lor$. The converse of implication (a1) can easily 
be derived by using (r2), and similarly for (a2) and (a3). 
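
For illustration, the derivation of the converse of (a1) can be spelled out as follows. This is a sketch under two assumptions: that the basic logic contains classical propositional logic, and that (r2) is read as monotonicity of composition in its first argument (from $\vdash A \to C$ infer $\vdash (A \circ B) \to (C \circ B)$).

```latex
\begin{align*}
1.\;& \vdash A \to (A \lor B) && \text{propositional logic}\\
2.\;& \vdash (A \circ C) \to ((A \lor B) \circ C) && \text{from 1 by (r2)}\\
3.\;& \vdash B \to (A \lor B) && \text{propositional logic}\\
4.\;& \vdash (B \circ C) \to ((A \lor B) \circ C) && \text{from 3 by (r2)}\\
5.\;& \vdash ((A \circ C) \lor (B \circ C)) \to ((A \lor B) \circ C) && \text{from 2 and 4, propositional logic}
\end{align*}
```

The converses of (a2) and (a3) would follow in the same way, using the monotonicity rules for the second argument of composition and for reverse, respectively.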

A proof of soundness and (strong) completeness for the extended framework 
will be included in a longer report on our work, in preparation. The proof is 
based on the standard technique of canonical model construction. 



3.2 The English Auction Protocol 

In this section, we illustrate the expressive capacity of the logic, by showing how 
it can be applied to the analysis of a conversational protocol. We focus on what 
are called English auctions, at least as a starting point. We here give the basic 
idea of the treatment. 

Figure 4 below depicts the English Auction Protocol used between an auctioneer 
agent a and each buyer agent b. The nodes (circles) represent states of 
the conversation, and the arcs (lines) represent speech acts that cause transitions 
from state to state in the conversation. The circles with a double-line represent 
the final states of the conversation. 

The propositional letters attached to the arcs are notational shorthand for 
the following speech acts: 

— A: a puts item c up for auction; 

— B: b makes a bid; 

— C: a informs b that the item is sold to another buyer; 

— D: a declares that the auction is at an end; 

— E: a informs b that another buyer overbids; 

— F: a informs b that his bid wins. 

We use propositional letters for clarity’s sake only. In fact, A corresponds to 
the antecedent of a conventional signalling rule of type (sc-declare), and likewise 
for D. B is to be replaced by the antecedent of a signalling convention taking 
the form of (sc-commit). The scope formula in the consequent uses a conditional 
obligation, $O(E_b A_2 / A_1)$, according to which b is under the obligation to pay if his 
offer is accepted. We leave aside discussion of the problem of how to analyse the 
conditional obligation operator $O(\,/\,)$ (an elaborate formal treatment is available 
in [6]) 14 . C, E and F each correspond to the antecedent of a signalling convention 
taking the form of (sc-assert). 

Fig. 4. English Auction Protocol 

The main function of a protocol is to define the sequences of speech acts that 
are permissible during a conversation. The basic idea is to assume that such 
sequences can be expressed in a compact way, by means of a disjunction containing 
$\circ$, $\smile$ and/or Id. For instance, the English Auction Protocol is an instantiation of 
the formula 

$(A \circ D) \lor (A \circ C) \lor (A \circ (B \circ F)) \lor (A \circ (B \circ (E \circ C)))$   (11) 

where (as we have just indicated) A, B, C, D, E and F stand for the antecedents 
of the appropriate signalling conventions. Since $\circ$ distributes over $\lor$, (11) can 
be simplified into 

$A \circ (D \lor C \lor (B \circ (F \lor (E \circ C))))$   (12) 

14 The formalization of contrary-to-duty (CTD) scenarios raises a problem that is 
usually considered as a hard one by deontic logicians. We note in passing that the concept 
of norm violation has an obvious counterpart in commitment-based approaches to 
conversation. In particular, Mallya et al.’s contribution to this volume (see [23]) 
puts some emphasis on the notion of breach of a commitment in a conversation. It 
would be interesting to investigate what such frameworks have to say about CTD 
scenarios. 

14 Andrew J.I. Jones and Xavier Parent 

(11) considers in isolation the sequences of acts that are allowed by the protocol. 
The first disjunct in (11), $A \circ D$, translates the path 1-2-5. The second disjunct, 
$A \circ C$, translates the path 1-2-4. The third disjunct, $A \circ (B \circ F)$, translates the 
path 1-2-3-6. The fourth and last disjunct, $A \circ (B \circ (E \circ C))$, translates the 
path 1-2-3-2-4. Formula (12) puts the sequences of speech acts together, and 
indicates the points when interactants have the opportunity to choose between 
two or more speech acts. (12) can be read as follows. Once A has been done, then 
we can have either D, C or B. And once B has been done, we can have either 
F or E-followed-by-C. For simplicity’s sake, we assume here that auctioneer a 
receives at most two bids. The fact that auctioneer a can receive more than two 
bids might be captured by an operator expressing iteration. 
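
The relationship between (11) and (12) can be checked mechanically. The sketch below is an illustration only: it assumes a simplified reading in which composition is plain sequencing of speech acts and disjunction is choice, and it reads (12) as A composed with the whole disjunction that follows it. The helper names `seq`, `alt` and `runs` are hypothetical.

```python
def seq(*parts):
    """Composition: the parts occur one after another."""
    return ("seq", parts)

def alt(*parts):
    """Disjunction: exactly one of the parts occurs."""
    return ("alt", parts)

def runs(f):
    """Return the set of act sequences (tuples of letters) allowed by f."""
    if isinstance(f, str):
        return {(f,)}
    kind, parts = f
    if kind == "alt":
        return set().union(*(runs(p) for p in parts))
    # seq: concatenate every choice of run from each part, left to right
    result = {()}
    for p in parts:
        result = {r + s for r in result for s in runs(p)}
    return result

# Formula (11): one disjunct per permitted path.
formula_11 = alt(
    seq("A", "D"),
    seq("A", "C"),
    seq("A", seq("B", "F")),
    seq("A", seq("B", seq("E", "C"))),
)

# Formula (12): the same protocol with the choice points factored out.
formula_12 = seq("A", alt("D", "C", seq("B", alt("F", seq("E", "C")))))

print(sorted(runs(formula_11)))
print(runs(formula_11) == runs(formula_12))
```

Under this reading both formulas allow the same four sequences of acts, one for each of the paths 1-2-5, 1-2-4, 1-2-3-6 and 1-2-3-2-4.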

As the auction evolves, there is a shift in focus from the whole disjunction (11) 
to one specific disjunct. The latter records the acts (which are not necessarily 
verbal) performed in a conversation. It seems reasonable to expect a formal 
language for ACLs to also provide a way of keeping a record of the conventional 
effects achieved by these acts. As a further refinement, the recording might take 
into account the fact that users of signalling system s are empowered agents, 
or the fact that one agent j trusts some other agent k. Although we need to 
subject this issue to further investigation, we can already give some hint of how 
such a record can be achieved in the present framework. It consists in using a 
construction proposed by Fitting [12] 

$S \models U \to X$   (13) 

which exploits the idea that the local and the global consequence relations used 
in modal logic can be subsumed under one more general relation. The formal def- 
inition of this notion can easily be adapted to the present framework. Intuitively, 
S expresses global assumptions, holding at all arrows. In contrast, U enumerates 
local assumptions, holding at particular arrows. In line with our previous anal- 
ysis — see section 2.1 — we assume that S contains the signalling conventions 
adopted by institution s. These are mutually believed by the agents who use s. 
Here, S plays the role of a black box that takes U (a sequence of communicative 
acts) as input and gives X (a list of conventional effects) as output. For instance, 
if the focus is on the sequence $A \circ (B \circ F)$, then S is the set having the following 
three elements: 

$E_a A_1 \Rightarrow_s I^{*} E_a A_4$   (14) 

$E_b A_2 \Rightarrow_s I^{*} O(E_b A_6 / A_5)$   (15) 

$E_a A_3 \Rightarrow_s I^{*} A_7$   (16) 



Now let us adopt the point of view of an external observer x. This means that 
we can specify U in (13) as 



$B_x E_a A_1 \circ (B_x E_b A_2 \circ B_x E_a A_3)$   (17) 

As can easily be verified, the doxastic form of modus ponens (2) used in the 
‘static’ framework allows us to specify X in (13) as 

$B_x I^{*} E_a A_4 \circ (B_x I^{*} O(E_b A_6 / A_5) \circ B_x I^{*} A_7)$,   (18) 



which represents a record of the conventional effects achieved in the conversation. 
Depending on x’s beliefs about the empowerment and trustworthiness of 
the communicators a and b, the record will include some further features. For 
instance, if x believes that a and b are empowered to declare and commit, 
respectively, and if x also believes that a’s assertion of $A_7$ is trustworthy (reliable), 
then the record will also show: 

$B_x E_a A_4 \circ (B_x O(E_b A_6 / A_5) \circ B_x A_7)$.   (19) 
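
The ‘black box’ reading of S, which takes a sequence of communicative acts and returns a list of conventional effects, can be sketched in a few lines. This is an informal illustration, not the formal consequence relation: formulas are handled as plain strings, and the names `Convention`, `record` and `accepted` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Convention:
    agent: str   # who performs the signalling act
    act: str     # the antecedent, e.g. "A1"
    effect: str  # the scope formula that follows I*, e.g. "E_a A4"

def record(conventions, acts, accepted=frozenset()):
    """Map a sequence of (agent, act) pairs to the observer's record.

    `accepted` holds the acts whose performer the observer x believes to be
    empowered (or trustworthy); for those, the I* operator is dropped.
    """
    table = {(c.agent, c.act): c for c in conventions}
    out = []
    for agent, act in acts:
        effect = table[(agent, act)].effect
        if (agent, act) in accepted:
            out.append(f"B_x {effect}")
        else:
            out.append(f"B_x I* {effect}")
    return out

# S: the three conventions of the example; U: the observed acts.
S = [
    Convention("a", "A1", "E_a A4"),
    Convention("b", "A2", "O(E_b A6 / A5)"),
    Convention("a", "A3", "A7"),
]
U = [("a", "A1"), ("b", "A2"), ("a", "A3")]

print(record(S, U))                   # the record without further beliefs
print(record(S, U, accepted=set(U)))  # x believes all three acts are effective
```

The first call mirrors the record of what ought to hold according to the conventions; the second mirrors the additional record obtained when x believes in the empowerment and trustworthiness of a and b.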

One last remark is to be made. So far we have used only the operator $\circ$, in 
order not to distract the reader from the main point we wish to make in this 
paper. It is possible to use the other two operators, Id and $\smile$, so as to capture 
further aspects of the protocol. The modal constant Id can be used to capture 
the obvious fact that, once a has suggested a starting-price for the goods, it may 
happen that another agent, call it b′, opens the bid. (In this case, all b sees is 
the new announcement.) Operator $\smile$ can be used to express the fact that, once 
E has been performed, the conversation returns to the prior state 2. Finally, 
it should be mentioned that the presence of a potential cycle might easily be 
captured by using the unary connective usually denoted by * and referred to 
as “iteration” (also “Kleene star”). We defer the full discussion of this issue to 
another occasion. 

4 Concluding Remarks 

Although the ‘dynamic’ account outlined in this paper is preliminary, we believe 
it points the way to a comprehensive theory of conversation, providing guid- 
ance to protocol designers. At the dynamic level, we have basically proposed a 
compact expression of conversation protocols, by using arrow logic. Although we 
need to subject this point to further investigation, we are inclined to think that 
this kind of representation will be able to facilitate the systematic comparison 
of protocols. 

In closing, let us add one further remark in connection with the second sug- 
gestion we have made. It is that the record process should take into account 
questions about whether users of signalling system s are empowered agents, or 
questions about whether one agent j trusts some other agent k. Considerations 
of the first type become particularly relevant when, for instance, we focus on 
those situations where agents buy and sell goods on behalf of some other agents. 
In recent years, we have seen the development of a number of systems that make 
it possible to advertise and search for goods and services electronically. Let us 
take the case of MIT’s Kasbah prototype [7]. It is a Web-based system where 
users create autonomous agents to buy and sell goods on their behalf. Each of 
these agents is autonomous in that, once released into the marketplace, it ne- 
gotiates and makes decisions on its own, without requiring user intervention. 
Suppose agent k makes a bid on behalf of user j. The background signalling 
convention (governing k’s communicative act) takes the form 

$(E_k C \Rightarrow_s I^{*} E_k E_j A) \land (E_j A \Rightarrow_s I^{*} E_j B)$   (20) 

If k is empowered to make an offer (if, for instance, a time-out has not taken 
place), then the truth of $E_k E_j A$ (and, hence, the truth of $E_j A$) is guaranteed. 
If user j is empowered as buyer (if j is not under age, or if j’s credit is greater 
than or equal to the price of the good), then the truth of $E_j B$ also obtains. 
Here the idea is to classify the performance of a communicative act as valid 
or invalid according to whether or not the agent that performed the action 
was institutionally empowered. Some work along these lines has already been 
conducted in the context of the study of the Contract-Net-Protocol (see Artikis 
et al. [1,2]). A thorough investigation of the relation between that account and 
the one outlined in this paper remains to be done. 
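
As a rough illustration of how empowerment propagates through the two conjuncts of the delegation convention, consider the following sketch. It is one possible reading, stated under assumptions: formulas are plain strings, the two boolean flags stand in for whatever institutional checks apply (time-outs, age, credit), and the function name `delegated_bid_effects` is hypothetical.

```python
def delegated_bid_effects(k_empowered: bool, j_empowered: bool) -> list[str]:
    """List what can be concluded once agent k has performed the act C."""
    # The first conjunct always yields what ought to hold according to s.
    effects = ["I* E_k E_j A"]
    if k_empowered:
        # k's act is valid: E_k E_j A holds and, hence, so does E_j A.
        effects += ["E_k E_j A", "E_j A"]
        # E_j A triggers the second conjunct of the convention.
        effects.append("I* E_j B")
        if j_empowered:
            # j is empowered as buyer: E_j B obtains as well.
            effects.append("E_j B")
    return effects

print(delegated_bid_effects(k_empowered=True, j_empowered=True))
print(delegated_bid_effects(k_empowered=False, j_empowered=True))
```

When k is not empowered, nothing beyond the first conventional effect is recorded, whatever the status of j, which corresponds to classifying k's act as invalid.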
Abstract. This paper proposes a new model of communication in multiagent 
systems according to which the semantics of communication depends on its 
pragmatics. Since these pragmatics are seen to result from 
the consequences of communicative actions as these have been empiri- 
cally observed by a particular agent in the past, the model is radically 
empirical, consequentialist and constructivist. A formal framework for 
analysing such evolving semantics is defined, and we present an exten- 
sive analysis of several properties of different interaction processes based 
on our model. Among the main advantages of this model over traditional 
ACL semantics frameworks is that it allows agents to reason about the 
effects of their communicative behaviour on the structure of commu- 
nicative expectations as a whole when making strategic decisions. Also, 
it leads to a very interesting domain-independent and non-mentalistic 
notion of conflict. 



1 Introduction 

One of the main challenges in the definition of speech-act based [1] agent com- 
munication language (ACL) semantics is explaining the link between illocution 
and perlocution, i.e. describing the effects of utterances (those desired by the 
sender and those brought about by the recipient of the message) solely in terms 
of the speech acts used. Various proposed semantics suggest that it is necessary 
to either resort to the mental states of agents [4, 3, 20] or to publicly visible com- 
mitments [5, 6, 8, 15, 19, 10, 21] in order to capture the semantics of speech acts, 
i.e. to aspects of the system that are external to communication itself. 

In the context of open large-scale multiagent systems (MAS) characterised by 
dynamically changing populations of self-interested agents whose internal design 
is not accessible to others, it is not clear how specifications of mental attitudes 
or systems of commitments can be linked to the observed interactions. How 
can we make predictions about agents’ future actions, if the semantics of their 
communication is defined in terms of mental states or commitments not related 
to the design of these agents? 

* This work is supported by Deutsche Forschungsgemeinschaft (German Research 
Foundation) under contract No. Br 609/11-2. An earlier version of this paper has 
appeared in [17]. 

In this paper, we suggest a view of communication that is a possible re- 
sponse to this problem. This view is based on abandoning the classical notion 
of “meaning” of utterances (in terms of “denotation”) and the distinction be- 
tween illocution and perlocution altogether in favour of defining the meaning of 
illocutions solely in terms of their perlocutions. 

Our view of communication is 

1. consequentialist, i.e. any utterance bears the meaning of its consequences 
(other observable utterances or physical actions), 

2. empirical, since this meaning is derived from empirical observation, and 

3. constructivist , because meaning is always regarded from the standpoint of a 
self-interested, locally reasoning and (boundedly) rational agent. 

By grounding meaning in interaction practice and viewing semantics as an emer- 
gent and evolving phenomenon, this model of communication has the capacity 
to provide a basis for talking about agent communication that will prove useful 
as more and more MAS applications move from closed to open systems. Its prac- 
tical use lies in the possibilities it offers for analysing agent interactions and for 
deriving desiderata for agent and protocol design. At a more theoretical level, 
our framework provides a very simple link between autonomy and control and 
introduces a new, powerful notion of conflict defined in purely communicative 
terms, which contrasts with mentalistic or resource-level conflict definitions such as 
those suggested in [12]. As a central conclusion, “good” protocols are proven 
to be both autonomy-respecting and contingency-reducing interaction patterns, 
which is shown through an analysis of example protocols with our framework. 

The remainder of this paper is structured as follows: section 2 presents the 
assumptions underlying our view of communication, and in section 3 we lay 
out requirements for agents our model is suitable for. Sections 4 and 5 describe 
the model itself which is defined in terms of simple consequentialist semantics 
and entropy measures. An analysis of several interaction scenarios follows in 
section 6, and we round up with some conclusions in section 7. 



2 Basics 

To develop our model of communication, we should first explain the most im- 
portant underlying assumptions. 

Firstly, we will assume that agents are situated in an environment that is co- 
inhabited by other agents they can communicate with. Agents have preferences 
regarding different states of the world, and they strive to achieve those states 
that are most desirable to them. To this end they deliberate, i.e. they take 
action to achieve their goals. Also, agents’ actions have effects on each other’s 
goal attainment - agents are interdependent. 

Secondly, we will assume that agents employ causal models of communicative 
behaviour in a goal-oriented fashion. In open, dynamic and unpredictable 
systems, it is useful to organise experience into cause-and-effect models (which 
will depend much more on statistical correlation than on “real” causality) 
of the behaviour of their environment in order to take rational action (in a 
“planning” sense of means-ends reasoning). This is not only true of the physical 
environment, but also of other agents. 

In the context of this paper, we will consider the foremost function of com- 
munication to be to provide such a causal model for the behaviour of agents in 
communication. These communicative expectations can be used by an agent in a 
similar way as rules that it discovers regarding the physical environment. 

Thirdly, there are some important differences between physical action exe- 
cuted to manipulate the environment and communicative interaction, i.e. mes- 
sages exchanged between agents: 

1. The autonomy of agents stands in contrast to the rules that govern physical 
environments - agents receive messages but are free to fulfil or disappoint 
the expectations [2] associated with them. An agent can expect his fellow 
agents to have a model of these expectations, so he can presume that they are 
deliberately violating them whenever they are deviating from expectations. 
This stands in clear contrast to the physical environment which may ap- 
pear highly non-deterministic but is normally not assumed to reason about 
whether it should behave the way we expect it to behave. 

2. Communication postpones “real” physical action 1 : it allows for the establish- 
ment of causal relationships between messages and subsequent messages or 
physical actions. 

This enables communicating agents to use messages as symbols that “stand 
for” real 2 action without actually executing it. Hence, agents can talk about 
future courses of action and coordinate their activities in a projective fash- 
ion before these activities actually occur. This can be seen as a fundamental 
property of communication endowing it with the ability to facilitate cooper- 
ation. 

With this in mind, we make the following claims: 

1. Past communication experience creates expectations for the future. 

2. Agents employ information about such expectations strategically. 

3. Communicative expectations generalise over individual observations. 

4. Uncertainty regarding expectations should be reduced in the long run. 

5. Expectations that hinder the achievement of agent goals have to be broken. 

Claim 1 states that causal models can be built by agents from experience and 
used for predicting future behaviour. Many representations for these models can 
be conceived of, for example the expectation networks which we have suggested 
in [13]. Claim 2 is a consequence of claim 1 and the above assumptions regarding 
agent rationality: we can expect agents to use any information they have to 
achieve their goals, so this should include communicative expectations. 

1 Of course, communication takes place in physical terms and hence is physical action. 
Usually, though, exchanging messages is not supposed to have a strong impact on 
goal achievement since it leaves the physical environment virtually unmodified. 

2 Note that “real” action can include changes of mental states, e.g. when an agent 
provides some piece of information to another and expects that agent to believe him. 
For reasons of simplicity, we will restrict the analysis in this paper to communicative 
patterns that have observable effects in the physical environment. 

The first claim that is not entirely obvious is statement 3 which points at 
a very distinct property of communicative symbols. It implies that in contrast 
to other causal models, the meaning of symbols used in communication is sup- 
posed to hold across different interactions. Usually, it is even considered to be 
identical for all agents in the society (cf. sociological models of communication 
[9, 11]). The fact that illocutions (which usually mark certain paths of interac- 
tions) represented by performatives in speech act theory are parametrised with 
“sender” and “recipient” roles conforms with this intuition. Without this gener- 
alisation (which is ultimately based on a certain homogeneity assumption among 
agents [11]), utterances would degenerate to “signals” that spawn particular re- 
actions in particular agents. Of course, agents may maintain rich models of indi- 
vidual partners with whom they have frequent interactions and which specialise 
the general meaning of certain symbols with respect to these particular agents. 
However, since we are assuming agents to operate in large agent societies, this 
level of specificity of symbol meaning cannot be maintained if the number of 
constructed models is to be kept realistically small - agents are simply forced 
to abstract from the reactions of an individual agent and to coerce experiences 
with different agents into a single model. 

Claims 4 and 5 provide a basis for the design criteria applied when build- 
ing agents that are to communicate effectively. Unfortunately, though, the goals 
they describe may be conflicting. Item 4 states that the uncertainty associated 
with expectations should be kept to a minimum. From a “control” point of view, 
ideally, an agent’s peers would react to a message in a mechanised, fully pre- 
dictable way so that any contingency about their behaviour can be ruled out. 
At the same time, the agent himself wants to be free to take any decision at 
any time to achieve his own goals. Since his plans might not conform with ex- 
isting expectations, he may have to break them as stated by statement 5. Or he 
might even desire some other peer to break an existing expectation, if, for exam- 
ple, the existing “habit” does not seem profitable anymore. We can summarise 
these considerations by viewing any utterance as a request, and asking what is 
requested by the utterance: the confirmation, modification or novel creation of 
an expectation. 

These considerations lead to several desiderata for semantic models of com- 
munication: 

— The meaning of a message can only be defined in terms of its consequences, 
i.e. the messages and actions that are likely to follow it. Two levels of effects 
can be distinguished: 

1. The immediate reactions of other agents and oneself to the message. 

2. The “second-order” impact of the message on the expectation structures 
of any observer, i.e. the way the utterance alters the causal model of 
communicative behaviour. 
— Any knowledge about the effects of messages must be derived from empirical 
observation. In particular, a semantics of protocols cannot be established 
without taking into account how the protocols are used in practice. 

— Meaning can only be constructed through the eyes of an agent involved in 
the interaction; it strongly relies on relating the ongoing communication to 
the agent’s own goals. 

Following these principles, we have developed a framework to describe and anal- 
yse communication in open systems that will be introduced in the following 
sections. 
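
To give a first intuition of what an empirical, consequentialist reading of meaning might look like in code, consider the following sketch. It is a stand-in under stated assumptions, not the model defined in the following sections: the meaning of a message is approximated by the observed relative frequencies of what followed it, and uncertainty is measured by plain Shannon entropy. The class name `ExpectationModel` and its methods are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2

class ExpectationModel:
    """Empirical expectations of one observing agent (illustrative only)."""

    def __init__(self):
        # message -> counts of the consequences observed after it
        self.counts = defaultdict(Counter)

    def observe(self, message, consequence):
        """Record that `consequence` followed `message` in some interaction."""
        self.counts[message][consequence] += 1

    def meaning(self, message):
        """Relative frequency of each observed consequence of `message`."""
        seen = self.counts[message]
        total = sum(seen.values())
        return {c: n / total for c, n in seen.items()} if total else {}

    def entropy(self, message):
        """Shannon entropy (in bits) of the expected consequences."""
        return -sum(p * log2(p) for p in self.meaning(message).values())

model = ExpectationModel()
for outcome in ["do", "do", "do", "refuse"]:
    model.observe("request", outcome)

print(model.meaning("request"))
print(model.entropy("request"))
```

A message whose consequences are always the same has entropy zero in this sketch; each broken expectation raises the entropy, which is one simple way to picture the tension between claims 4 and 5.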

3 Assumptions on Agent Design 

3.1 The InFFrA Social Reasoning Architecture 

In order to present the view of communication that we propose in this paper, 
we first need to make certain assumptions regarding the type of agents it is 
appropriate for. For this purpose, we shall briefly introduce the abstract social 
reasoning architecture InFFrA that has previously been described in full detail 
in [18]. We choose InFFrA to describe this view of communication, because it 
realises the principles laid out in the previous section, while making only fairly 
general assumptions about the kind of agents our models are suitable for. 

InFFrA is based on the idea that agents organise the interaction situations 
they find themselves in into so-called interaction frames [7], i.e. knowledge 
structures that represent certain categories of interactions. These frames contain 
information about 

— the possible interaction trajectories (i.e. the courses the interaction may take 
in terms of sequences of actions/messages), 

— roles and relationships between the parties involved in an interaction of this 
type, 

— contexts within which the interaction may take place (states of affairs before, 
during, and after an interaction is carried out) and 

— beliefs , i.e. epistemic states of the interacting parties. 

While certain attributes of the above must be assumed to be shared knowledge 
among interactants (so-called common attributes) for the frame to be carried 
out properly, agents may also store their personal experience in a frame (in the 
form of private attributes), e.g. utilities associated with previous frame enact- 
ments, etc. What makes interaction frames distinct from interaction protocols 
and conversation policies is that 

(i) they provide comprehensive characterisations of an interaction situation 
(rather than mere restrictions on the range of admissible message sequences), 

(ii) they always include information about experience with some interaction pat- 
tern, rather than just rules for interaction. 

Apart from the interaction frame abstraction, InFFrA also offers a control flow 
model for social reasoning and social adaptation based on interaction frames, 
through which an InFFrA agent performs the following steps in each reasoning 
cycle: 
1. Matching: Compare the current interaction situation with the currently 
activated frame. 

2. Assessment: Assess the usability of the current frame. 

3. Framing decision: If the current frame seems appropriate, continue with 6. 
Else, proceed with 4. 

4. Re-framing: Search the frame repository for more suitable frames. If candi- 
dates are found, “mock-activate” one of them and go back to 1; else, proceed 
with 5. 

5. Adaptation: Iteratively modify frames in the repository and continue with 4. 

6. Enactment: Influence action decisions by applying the current frame. Return 
to 1. 
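
The control flow of the six steps above can be sketched in a few lines. This is only an illustrative reading, not the InFFrA implementation: the names `framing_step`, `matches` and `adapt` are hypothetical, matching and assessment (steps 1 and 2) are collapsed into a single predicate, and the number of adaptation rounds is bounded so that the loop terminates.

```python
from typing import Callable, List, Optional, TypeVar

Frame = TypeVar("Frame")


def framing_step(
    active: Frame,
    repository: List[Frame],
    matches: Callable[[Frame], bool],              # steps 1-2 (hypothetical predicate)
    adapt: Callable[[List[Frame]], List[Frame]],   # step 5 (hypothetical)
    max_adaptations: int = 3,
) -> Optional[Frame]:
    """Return the frame to enact (step 6), or None if no usable frame is found."""
    if matches(active):                  # step 3: framing decision
        return active
    for _ in range(max_adaptations + 1):
        for candidate in repository:     # step 4: re-framing ("mock-activation")
            if matches(candidate):
                return candidate
        repository = adapt(repository)   # step 5: adaptation, then back to step 4
    return None
```

In a full agent the returned frame would then influence the action decision (enactment) before the next reasoning cycle starts.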

This core reasoning mechanism, called framing, is supposed to be performed 
by InFFrA agents in addition to their local goal-oriented reasoning processes 
(e.g., a BDI [16] planning and plan monitoring unit). It is generic enough to 
cater for almost any kind of “socio-empirically adaptive” agent design. 

Using the InFFrA architecture, we can specify a “minimal” set of properties for 
agent design to be in accordance with the principles laid out for our framework 
in section 2. 

3.2 “Minimal” InFFrA Agents 

The simplest InFFrA-compliant agent design that can be conceived of is as fol- 
lows: we consider agents that engage in two-party turn-taking interactions that 
occur in discrete time and whose delimiting messages/actions can always be de- 
termined unambiguously. This means that agents always interact only with one 
peer at a time, that these encounters consist of a message exchange in which 
agents always take turns, and that an agent can always identify the beginning 
and end of such an encounter (e.g. by applying some message timeout after which 
no further message from the other agent is expected anymore). 

We also assume the existence of some special “deictic” message performative 
do(A, X) that can be sent by agent A to indicate it is executing a physical (i.e. 
non-communicative) action X in the environment. More precisely, do(A,X) is 
actually a shortcut for an observation action of the “recipient” of this message 
by which he can unambiguously verify whether A just executed X and which he 
interprets as part of the encounter; it need not be some distinguished symbol 
that has been agreed upon. 

Further, we assume that agents store these encounters as “interaction frames” 
$F = (C, w, h)$ in a (local) frame repository $\mathcal{F}$, where $C$ is a 
condition, $w$ is a message sequence and $h$ is a vector of message counters. 

The message sequence of a frame is a simple kind of trajectory that can be 
seen as a word $w \in \Sigma^*$ from some alphabet of message symbols $\Sigma$ 
(which include the do-symbols that refer to physical actions). Although agents 
may invent new symbols and the content language of messages (e.g. first-order 
logic) may allow for an infinite number of expressions, $\Sigma$ is finite, 
since it only ever contains symbols that have already occurred in previous 
interactions. 

Since specific encounters are relevant/possible under particular circumstances 
only, we assume that the agent has some knowledge base $KB$ the contents of 
which are, at any point in time, a subset of some logical language $L$, i.e. 
$KB \in 2^L$. Then, provided that the agent has a sound inference procedure for 
$L$ at its disposal, it can use a condition (expressed by a logical formula 
$C \in L$) to restrict the scope of a message sequence to only those situations 
in which $C$ holds: 

$$(C, w, h) \in \mathcal{F} \iff (KB \models C \Rightarrow w \text{ can occur})$$

In practice, $C$ is used to encode any information about roles and relationships, 
contexts and beliefs associated with a frame as described in section 3.1. 

As a last element of the frame format we use, agents employ “usage counters” 
$h \in \mathbb{N}^{|w|}$ for each message in a frame trajectory. The counter 
values for all messages in some prefix trajectory sequence $w \in \Sigma^*$ are 
incremented in all frames that share this prefix word whenever $w$ occurs, i.e. 

$$(w \text{ has occurred } n \text{ times} \wedge |w| = i) \Rightarrow \forall (C, wv, h) \in \mathcal{F}.\ h_i = n$$

(for some $v \in \Sigma^*$). This means that $h$ is an integer-valued vector that 
records, for each frame, how often an encounter has occurred that started with 
the same prefix $w$ (note that during encounters, $h_i$ is incremented in all 
frames that have shared prefixes $w$ if this is the message sequence just 
perceived until the $i$th message). Therefore, 
$count(F)[i] \geq count(F)[i + 1]$ for any frame $F$ and any $i < |traj(F)|$ (we 
use functions $cond(F)$, $traj(F)$ and $count(F)$ to obtain the values of $C$, 
$w$ and $h$ in a frame, respectively). To keep $\mathcal{F}$ concise, no 
trajectory occurs twice, i.e. 

$$\forall F, G \in \mathcal{F}.\ F \neq G \Rightarrow traj(F) \neq traj(G)$$

and if a message sequence $w = traj(F)$ that has been experienced before occurs 
(describing an entire encounter) under conditions $C'$ that are not compatible 
with $cond(F)$ under any circumstances (i.e. $cond(F) \wedge C' \models false$), 
$F$ is modified to obtain $F' = (cond(F) \vee C', w, h)$. 
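
A minimal sketch of such a repository may help to make the counter bookkeeping concrete. It rests on several assumptions that are not part of the text: conditions are kept as a set of opaque strings standing for the disjuncts of $C$, any new condition is simply added as a further disjunct (the incompatibility test is omitted), and the names `Frame`, `FrameRepository` and `record` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Trajectory = Tuple[str, ...]


@dataclass
class Frame:
    cond: Set[str]                                   # disjuncts of condition C
    traj: Trajectory                                 # message sequence w
    count: List[int] = field(default_factory=list)   # usage counters h


class FrameRepository:
    def __init__(self) -> None:
        self.frames: Dict[Trajectory, Frame] = {}    # one frame per trajectory
        self.prefix_counts: Counter = Counter()      # occurrences of each prefix

    def record(self, encounter: Trajectory, cond: str = "true") -> None:
        """Store a completed encounter and refresh the counters of all frames."""
        for i in range(1, len(encounter) + 1):
            self.prefix_counts[encounter[:i]] += 1
        if encounter in self.frames:
            # Known trajectory: widen the condition instead of duplicating it.
            self.frames[encounter].cond.add(cond)
        else:
            self.frames[encounter] = Frame({cond}, encounter)
        for frame in self.frames.values():
            # h_i = number of encounters that started with the first i messages.
            frame.count = [self.prefix_counts[frame.traj[:i]]
                           for i in range(1, len(frame.traj) + 1)]
```

Keying the frames by trajectory enforces that no trajectory occurs twice, and deriving every counter from shared prefix counts yields counters that never increase along a trajectory.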

As a final element in this agent architecture, we assume the existence of a 
utility function 

$$u : 2^L \times \Sigma^* \to \mathbb{R}$$

which will provide to the agent an assessment of the utility u(KB , w) of any mes- 
sage/action sequence w and any knowledge base content KB. Note that while 
it appears to be a rather strong assumption that the utility of any message se- 
quence can be numerically assessed in any state of belief, this is not intended 
as a measure for how good certain messages are in a “social” sense. Rather, it 
suffices if u returns estimates of the “goodness” of physical do-messages with re- 
spect to goal achievement and assigns a small negative utility to all non-physical 
messages that corresponds to the cost incurred by communication. 

Minimal InFFrA agents who construct frame repositories in this way can use 
them to record their interaction experience: in any given situation, they can 
filter out those frames that are irrelevant under their current beliefs and 
compute probabilities for other agents’ actions and for the expectations others 
hold towards them given their own previous behaviour. They can assess the 
usability of certain frames by consulting their utility function, and they use 
the trajectories in $\mathcal{F}$ both to determine the frames that are 
applicable and to pick their next actions. 







4 Empirical Semantics 

As mentioned before, the semantic model we want to propose is purely conse- 
quentialist in that it defines the meaning of utterances in terms of their effects. 

Let $2 \cdot H \in \mathbb{N}$ be some upper bound on the possible length of 
encounters, and let $\Delta(\Sigma^H)$ be the set of all discrete probability 
distributions over all words from $\Sigma^*$ no longer than $H$. 

We define the interpretation $I_\mathcal{F}$ induced by some frame repository 
$\mathcal{F}$ as a mapping from knowledge base states and current encounter 
sequence prefixes to the posterior probability distributions over all possible 
postfixes (conclusions) of the encounter. Formally, 
$I_\mathcal{F} \in (2^L \times \Sigma^H \to \Delta(\Sigma^H))$ with 

$$I_\mathcal{F}(KB, w) = \lambda w'.\, P(w' \mid w)$$

where 

$$P(w' \mid w) = \alpha \cdot \sum_{\substack{F \in \mathcal{F},\ traj(F) = ww', \\ KB \models cond(F)}} count(F)[|traj(F)|]$$

for any $w, w' \in \Sigma^H$ and some normalisation constant $\alpha$. 

This means that, considering those frames only whose conditions hold under 
KB , we compute the ratio of experienced conclusions w' to the already perceived 
prefix encounter w and the number of all potential conclusions to w. 

The intuition behind this definition is that during an interaction encounter, if 
the encounter started with the initial sub-sequence $w$, the interpretation 
function $I_\mathcal{F}$ will yield a probability distribution over all possible 
continuations $w'$ that may occur in the remainder of the current interaction 
sequence. 
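
The interpretation function can be illustrated with a small sketch. The representation is an assumption made for this example only: a frame is a triple of a predicate over the knowledge base (standing in for $KB \models cond(F)$), a trajectory and its counter vector, the knowledge base is a set of facts, and encounter termination is written as an explicit `"END"` symbol.

```python
from typing import Callable, Dict, List, Set, Tuple

Trajectory = Tuple[str, ...]
Frame = Tuple[Callable[[Set[str]], bool], Trajectory, List[int]]


def interpretation(frames: List[Frame], kb: Set[str],
                   prefix: Trajectory) -> Dict[Trajectory, float]:
    """Return P(w' | w) for every continuation w' of the prefix w."""
    weights: Dict[Trajectory, float] = {}
    for cond, traj, count in frames:
        # Only frames whose condition holds and whose trajectory extends w count.
        if cond(kb) and traj[:len(prefix)] == prefix:
            rest = traj[len(prefix):]
            weights[rest] = weights.get(rest, 0.0) + count[-1]
    total = sum(weights.values())          # 1 / alpha
    return {rest: w / total for rest, w in weights.items()} if total else {}


# Hypothetical repository: 30 fulfilled and 70 unanswered requests, plus one
# frame whose condition does not hold in the current knowledge base.
frames: List[Frame] = [
    (lambda kb: True, ("request", "do"), [100, 30]),
    (lambda kb: True, ("request", "END"), [100, 70]),
    (lambda kb: "holiday" in kb, ("request", "do", "thanks"), [5, 5, 5]),
]
distribution = interpretation(frames, set(), ("request",))
```

Here the last counter of each matching frame serves as the weight of its continuation, and dividing by the sum of weights plays the role of the normalisation constant.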

Finally, given this probability distribution, we can also compute the expected 
“future utility” of any message sequence $w$ by computing 

$$\bar{u}(w) = \sum_{w' \in \Sigma^H} I_\mathcal{F}(KB, w)(w') \cdot u(KB', w')$$

if $KB'$ is the state of the knowledge base after $w'$ has occurred.³ 

The definitions in this section closely resemble the framework of Markov 
Decision Processes (MDPs). To capture the fact that probabilities of 
communication effects are affected by the decision-making agent herself, 
however, the MDP model would have to be modified appropriately. For the purposes 
of the present analysis, defining some simple measures on expectation structures 
will suffice. 

5 Entropy Measures 

With the above definitions at hand, we can now return to the principles of 
communication laid out in section 2. There, we claimed that an agent strives to 
reduce the uncertainty about others’ communicative behaviour, and at the same 
time to increase his own autonomy. 

³ This is because w' might involve actions that change the state of the environment. 
Unfortunately, this definition requires that the agent be able to predict these changes 
to the knowledge base a priori. 

We can express these objectives in terms of the expectation entropy EE and 
the utility deviation UD that can be computed as follows: 



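$$EE_\mathcal{F}(w, KB) = \sum_{w' \in \Sigma^H} -P(w' \mid w) \log_2 P(w' \mid w)$$

$$UD_\mathcal{F}(w, KB) = \sqrt{\sum_{w' \in \Sigma^H} \left(u(w', KB) - \bar{u}(w', KB)\right)^2}$$

Total entropy $\mathcal{E}_\mathcal{F}(w, KB)$ of message sequence $w$ is defined as follows: 

$$\mathcal{E}_\mathcal{F}(w, KB) = EE_\mathcal{F}(w, KB) \cdot UD_\mathcal{F}(w, KB)$$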



How can these entropy measures be interpreted? The expectation entropy as- 
sesses the information-theoretic value of having performed/perceived a certain 
sequence w of messages. By computing the information value of all potential 
continuations, EE (again, we drop subscripts and arguments whenever they are 
obvious from the context) expresses the entropy that is induced by w in terms 
of potential continuations of this encounter prefix: the lower EE, the higher the 
value of w with respect to its ability of reducing the uncertainty of upcoming 
messages/actions. Thus, by comparing expectation entropies for different mes- 
sages in the process of selecting which message to utter, the agent can compare 
their values or regard the system of all possible messages as an “encoding” for 
future reactions. 

Utility deviation, on the other hand, is defined as the standard deviation 
between the utilities of all possible continuations of the encounter given w so 
that the importance of the potential consequences of w can be assessed. Its 
power lies in being closely related to the expected utility of the encounter, while 
at the same time providing a measure for the risk associated with the encounter 
sequence perceived so far. 

Returning to the observation we made regarding the “request” nature of any 
communicative action in section 2, we can now rephrase this view in terms of 
the mathematical tools introduced in the above paragraphs: any message 
$v \in \Sigma$ considered in the context of an encounter has an expectation 
entropy associated with it, so that EE(wv, KB) can be used to predict how much 
using v will help to “settle” the communication situation, i.e. to reduce the 
number of potential outcomes of the entire encounter. At the same time, 
UD(wv, KB) can be used to check how “grave” the effects of different outcomes 
would be. 

By combining these two measures into $\mathcal{E}$, the agent can trade off the 
reduction of uncertainty against sustainment of autonomy depending on its 
willingness to conform with existing expectations or to deviate in order to 
pursue goals that contradict the expectations held towards the agent. 
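
The three measures can be sketched directly from a distribution over continuations and their utilities. One point is an assumption of this sketch rather than something stated in the text: utility deviation is computed as the population standard deviation of the continuation utilities around the probability-weighted expected utility. With this choice, a repository in which 30% of requests lead to an action of utility -10 and 70% go unanswered gives approximately the values 0.8812, 5.39 and 4.74 quoted for the simple-request protocol in section 6.2. All function names are hypothetical.

```python
import math
from typing import Dict, Tuple

Continuation = Tuple[str, ...]


def expectation_entropy(p: Dict[Continuation, float]) -> float:
    """Shannon entropy (base 2) of the distribution over continuations."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def expected_utility(p: Dict[Continuation, float],
                     u: Dict[Continuation, float]) -> float:
    return sum(p[c] * u[c] for c in p)


def utility_deviation(p: Dict[Continuation, float],
                      u: Dict[Continuation, float]) -> float:
    """Assumed form: population standard deviation around the expected utility."""
    mean = expected_utility(p, u)
    return math.sqrt(sum((u[c] - mean) ** 2 for c in p) / len(p))


def total_entropy(p: Dict[Continuation, float],
                  u: Dict[Continuation, float]) -> float:
    return expectation_entropy(p) * utility_deviation(p, u)


p = {("do",): 0.3, ("END",): 0.7}        # continuations after "request"
u = {("do",): -10.0, ("END",): 0.0}      # utilities of these continuations
ee, ud, te = expectation_entropy(p), utility_deviation(p, u), total_entropy(p, u)
```

A low `ee` means the prefix leaves little doubt about what follows, while a high `ud` signals that the remaining alternatives differ strongly in their consequences.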
6 Analysis 

To see how the above framework may help interpret the meaning of utterances 
and guide the agent’s behaviour, we will compare three different interaction 
scenarios, in which the frame repositories of some agent $a_1$ have been compiled 
into the trees shown in figures 1, 2 and 3, respectively (we use trees of interaction 
trajectories as defined in [2] instead of sets of sequences as a more compact 
representation). The nodes which represent messages are connected by edges that 
are labelled with transition probabilities in italics (computed using count(F)). 
We use variables A, B, X etc. to capture several “ground” situations by a single 
tree. The substitutions that are needed to reconstruct past interactions using the 
tree are not displayed in the examples, but form part of the private attributes 
(cf. section 3.1). 

Where the direct utility associated with an action is not zero, the in- 
crease/decrease in total utility is printed on top of the action in bold face in 
square brackets [] (if communication preceding these “utility nodes” comes at a 
cost, this has been already considered in the utility of the leaf node). For sim- 
plicity, we also assume the trees presented here to be the result of combining all 
frames that are consistent with the current knowledge base, i.e. frame conditions 
have already been checked. 

6.1 Interaction Scenarios 

The repository shown in figure 1 summarises experience with a “simple-request” 
protocol (SRP) where one agent starts by requesting an action X and the other 
may simply execute the requested action or end the encounter (the ⊥ symbol is 
used to denote encounter termination whenever termination probability is below 
1.0); in a sense, this is the most “minimal” protocol one can think of. So far, 
only 30% of all requests have been fulfilled; all others went unanswered. 



We now picture a situation in which agent $a_1$ is requested by agent $a_2$ to 
execute some action, but this action has a utility of -10 for $a_1$. Note that 
the probabilities in the tree are derived from observing different interactions 
where $a_1$ may have held both participating parties’ roles in different 
instances, but the utility decrease of 10 units is computed on the grounds of 
the current situation, by instantiating variable values with agent and action 
names (e.g. $A = a_2$, $B = a_1$ and $X$ = deliver(quantity = 100)). 




[Figure: frame repository tree; recoverable labels: do(B,X) with utility [-10].]

Fig. 1. SRP (simple-request protocol) frame repository tree. 






[Figure: frame repository tree, edge labels are transition probabilities] 

request(A,B,X) 
    0.3  accept(B,A,X) → 1.0 confirm(A,B,X) → 0.9 do(B,X) [-10] 
    0.7  reject(B,A,X) 

Fig. 2. RAP (request-accept protocol) frame repository tree. 



[Figure: frame repository tree, edge labels are transition probabilities] 

request(A,B,X) 
    0.3  accept(B,A,X) → 1.0 confirm(A,B,X) → 0.9 do(B,X) [-10] 
    0.2  propose(B,A,Y) 
             0.5  accept-proposal(A,B,Y) → 0.9 do(B,X) [-5] → 0.774 do(A,Y) [-5] 
             0.5  reject-proposal(A,B,Y) 
    0.5  reject(B,A,X) 

Fig. 3. RCOP (request-counter-offer protocol) frame repository tree. 



Figure 2 shows a “request-accept” protocol (RAP) that leaves some more op- 
tions to the requestee as he may accept or reject the request. After confirmation 
of the requesting agent (which is certain), the requestee executes the request 
with a probability of 90%; in 10% of the cases, the agent who agreed to fulfil the 
request is unreliable. 

The “request-counter-offer” protocol (RCOP) in figure 3 offers more 
possibilities still: it includes “accept” and “reject” options, but it also 
allows for making a proposal Y that the other agent may accept or reject in 
turn, and if this proposal is accepted, that other agent is expected to execute 
action Y if the first agent executes X. The distribution between 
accept/propose/reject is now 0.3/0.2/0.5, because it is realistic to assume 
that in 20% of the cases in which the initial offer would have been rejected in 
the RAP, the requestee was able to propose a compromise in the RCOP. As before, 
the requestee fails to perform X with probability 0.1, and this unreliability 
is even larger (23%) for the other agent. This is realistic, because the second 
agent is tempted to “cheat” once his opponent has done his duty. In the 
aforementioned scenario, we assume that the “compromise” actions X and Y (e.g. 
X = deliver(quantity = 50), Y = pay_bonus) both have utility -5.0, i.e. the 
compromise is not better than the original option deliver(quantity = 100). 

Now let us assume $a_1$ received the message 

request($a_2$, $a_1$, deliver(quantity = 100)) 

from $a_2$ who starts the encounter. The question that $a_1$ faces is whether 
he should perform the requested action despite the negative utility just for the 
sake of improving the reliability of the frame set or not.⁴ 

6.2 Entropy Decrease vs. Utility 

First, consider the case where he chooses to perform the action. In the SRP, 
this would decrease UD(request) from 5.39 to 5.36,⁵ but it would increase 
EE(request) from 0.8812 to 0.8895. The total entropy $\mathcal{E}$(request) 
would increase from 4.74 to 4.76. If the requested action is not executed, 
utility deviation would rise to 5.40, expectation entropy would decrease to 
0.8776, and the resulting total entropy would be 4.73. 

How can we interpret these changes? They imply that choosing the more 
probable option ⊥ reduces entropy while performing the action increases it. 
Thus, since most requests go unanswered, doing nothing reaffirms this 
expectation. Yet, this increases the risk (utility deviation) of request, so 
$a_1$’s choice should depend on whether he thinks it is probable that he will 
himself be in the position of requesting an action from someone else in the 
future (e.g. if the utility of do becomes +10.0 in a future situation and $a_1$ 
is requesting that action). But since the difference in $\Delta\mathcal{E}$ (the 
difference between entropies after and before the encounter) is small (0.02 vs. 
-0.01), the agent should only consider sacrificing the immediate payoff if it is 
highly probable that the roles will be switched in the future. 
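
The figures in this subsection can be re-derived with a short calculation. The sketch assumes that the SRP repository consists of 100 encounters (30 fulfilled, 70 unanswered, as suggested by the 30% figure and footnote 5) and that utility deviation is the population standard deviation around the expected utility; both are assumptions of this example, and `measures` is a hypothetical helper.

```python
import math
from typing import Dict, Tuple


def measures(counts: Dict[str, int], utils: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (EE, UD, total entropy) for the outcomes that follow a request."""
    total = sum(counts.values())
    p = {k: n / total for k, n in counts.items()}
    ee = -sum(q * math.log2(q) for q in p.values() if q > 0)
    mean = sum(p[k] * utils[k] for k in p)
    ud = math.sqrt(sum((utils[k] - mean) ** 2 for k in p) / len(p))
    return ee, ud, ee * ud


utils = {"do": -10.0, "end": 0.0}
before = measures({"do": 30, "end": 70}, utils)    # current repository
perform = measures({"do": 31, "end": 70}, utils)   # the agent executes the action
refuse = measures({"do": 30, "end": 71}, utils)    # the agent ends the encounter

delta_perform = perform[2] - before[2]   # roughly +0.02
delta_refuse = refuse[2] - before[2]     # roughly -0.01
```

Under these assumptions the computed values agree with the quoted ones up to the last printed digit, which appears to be truncated rather than rounded in the text.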

Let us look at the same situation in the RAP case. The first difference to 
note here is that 

UD(accept) = UD(confirm) = 6.40 > 4.76 = UD(request) 

This nicely illustrates that the “closer” messages are to utility-relevant 
actions, the greater the potential risk, unless occurrence of the 
utility-relevant action is absolutely certain. This means that the 0.9/0.1 
distribution of do/⊥ constitutes a greater risk than the 0.7/0.3 distribution 
of reject/accept, even though EE(confirm) < EE(request)! 

If $a_1$ performs the requested action, the total entropy of request increases 
from 4.86 to 4.89; if he doesn’t (by sending a reject), it decreases to 4.84. 
Since this resembles the entropy effects in the SRP very much, what is the 
advantage of having such a protocol that is more complex? 
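
The RAP comparison can be checked numerically from the transition probabilities of figure 2. In the sketch below the tree encoding, the `"end"` symbol for termination and all function names are hypothetical, and utility deviation is again assumed to be the population standard deviation around the expected utility. Under that assumption the values 6.40 and 4.76 come out as the utility deviations after confirm and after request, while the expectation entropy after confirm is the smaller of the two.

```python
import math
from typing import Iterator, List, Tuple

# Each edge is (transition probability, message, utility, subtree).
RAP: List[tuple] = [
    (0.3, "accept", 0.0, [
        (1.0, "confirm", 0.0, [
            (0.9, "do", -10.0, []),
            (0.1, "end", 0.0, []),
        ]),
    ]),
    (0.7, "reject", 0.0, []),
]


def continuations(edges: List[tuple], p: float = 1.0,
                  u: float = 0.0) -> Iterator[Tuple[float, float]]:
    """Yield (probability, utility) for every complete continuation."""
    for q, _message, utility, subtree in edges:
        if subtree:
            yield from continuations(subtree, p * q, u + utility)
        else:
            yield p * q, u + utility


def expectation_entropy(edges: List[tuple]) -> float:
    return -sum(p * math.log2(p) for p, _ in continuations(edges) if p > 0)


def utility_deviation(edges: List[tuple]) -> float:
    outcomes = list(continuations(edges))
    mean = sum(p * u for p, u in outcomes)
    return math.sqrt(sum((u - mean) ** 2 for _, u in outcomes) / len(outcomes))


after_request = RAP                  # what may follow request(A,B,X)
after_confirm = RAP[0][3][0][3]      # what may follow confirm(A,B,X)
```

The product of both measures after request is about 4.86, the total entropy quoted for request in the RAP.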

6.3 External Paths and Path Criticality 

The advantages of the RAP become evident when looking at the entropies of 
accept and confirm after a reject, which remain unaffected (since they are 
located on different paths than reject). So RAP is, in a sense, superior to SRP, 
because it does allow for deviating from a certain expectation by deferring the 
expectations partly to messages on unaffected external paths. Effectively this 
means that after a reject, a request becomes riskier in future encounters, but 
if the agent waits until the accept message in a future interaction, he can be 
as certain of the consequences as he was before. Of course, in the long run this 
would render request almost useless, but if used cautiously, this is precisely the 
case where autonomy and predictability can be combined to serve the needs of 
the agents. 

⁴ Ultimately, this depends on the design of the agent, i.e. in which way this reliability 
is integrated in utility computation. 

⁵ The small changes are due to the fact that the frame repository is the product of 
100 encounters; a single new encounter induces only small changes to the numerical 
values. 

The most dramatic changes to entropy values will be witnessed if the agent 
doesn’t perform the action, but promises to do so by uttering an accept message: 
$\mathcal{E}$(request) increases from 4.86 to 5.05, $\mathcal{E}$(accept) and 
$\mathcal{E}$(confirm) both increase from 3.00 to 3.45. This is an example of 
how our analysis method can provide information about path criticality: it shows 
that the normative content of accept is very fragile, both because it is closer 
to the utility-relevant action and because it has been highly reliable so far. 
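
The jump in entropies caused by a broken promise can be reproduced with a small calculation. The sketch assumes a repository of 100 RAP encounters split according to figure 2 (27 completed, 3 with a missing action after confirm, 70 rejected) and takes utility deviation to be the population standard deviation around the expected utility; the helper `total_entropy` is hypothetical.

```python
import math
from typing import List, Tuple


def total_entropy(outcomes: List[Tuple[int, float]]) -> float:
    """outcomes: one (observed count, utility) pair per possible continuation."""
    n = sum(count for count, _ in outcomes)
    probs = [count / n for count, _ in outcomes]
    utils = [utility for _, utility in outcomes]
    ee = -sum(p * math.log2(p) for p in probs if p > 0)
    mean = sum(p * u for p, u in zip(probs, utils))
    ud = math.sqrt(sum((u - mean) ** 2 for u in utils) / len(utils))
    return ee * ud


# Continuations after request: action done, promise broken, request rejected.
request_before = total_entropy([(27, -10.0), (3, 0.0), (70, 0.0)])
request_after = total_entropy([(27, -10.0), (4, 0.0), (70, 0.0)])

# Continuations after accept/confirm: action done, promise broken.
confirm_before = total_entropy([(27, -10.0), (3, 0.0)])
confirm_after = total_entropy([(27, -10.0), (4, 0.0)])
```

A single additional broken promise shifts the entropy after confirm far more than a single rejection shifts the entropy after request, which is the fragility the text describes.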

6.4 Trajectory Entropy Shapes 

Let us now look at the RCOP and, once more, consider the two alternatives 
of executing the request right away or rejecting the request. Now, the total 
entropy decreases from 14.41 to 14.38 and 14.35 in the case of accept/reject, 
respectively. This is similar to the SRP and the RAP, even though the effects of 
different options are now less clearly visible (which is due to the fact that 
refusal and acceptance are now more evenly distributed). Also, the total entropy 
of request is more than three times higher than before (with comparable 
utility values). This suggests that it might be a good idea to split the RCOP 
into two frames that start with different performatives, e.g. request-action 
and request-proposal. 

Of course, the propose option is what is actually interesting about the RCOP, 
and the final step in our analysis will deal with this case. If $a_1$ analyses 
the possible runs that include a propose message, he will compare the effects of 
the following encounters on the frame tree with each other: 

Short name      Encounter 
“success”:      request(A, B, X) → ... → do(A, Y) 
“A cheats”:     request(A, B, X) → ... → do(B, X) 
“B cheats”:     request(A, B, X) → ... → accept-proposal(A, B, Y) 
“rejection”:    request(A, B, X) → ... → reject-proposal(B, A, X) 



Figures 4 and 5 show the values of $\mathcal{E}(w)$ and $\Delta\mathcal{E}(w)$ 
(the change in total entropy before and after the encounter) computed for the 
messages along the path 

w = propose(A, B, X) → ... → do(A, Y) 



A first thing to note is the shape of the entropy curve in figure 4, which is 
typical of meaningful trajectories. As illustrated by the boxed “perfect” 
entropy curve, reasonable trajectories should start with an “autonomy” part 
with high entropy which gives agents several choices, and then continue with a 
“commitment” part in which entropy decreases rapidly to make sure there is 
little uncertainty in the consequences of the interaction further on. 

[Figure: plot; x-axis: Message] 

Fig. 4. RCOP entropies along “success” path for all four interaction cases. 

Secondly, figure 5, which plots the changes to the node entropies before 
and after the respective interaction, shows that, as in the RAP, cheating has a 
negative impact on entropies. Moreover, the effects of “A cheats” appear to be 
much worse than those of “B cheats”, which confirms our intuition that the 
closer utterances are to the final outcome of the encounter, the more critical 
the expectations about them will be. 

Thirdly, as before, the “rejection” dialogue and the “success” dialogue are 
acceptable in the sense of decreasing entropies of propose and accept-proposal 
(note that the small entropy increase of request is due to the 0.1/0.23 probabil- 
ities of cheating after accept-proposal and do(B, X)). The fact that “success” 
is even better than “rejection” suggests that, in a situation like this, there is 
considerable incentive to compromise, if the agent is willing to sacrifice current 
payoff for low future entropies. 

6.5 Conflict Potential 

Looking at the plots in figure 5, a more general property of communication 
becomes evident: we can imagine an agent reckoning what to do in an ongoing 
encounter who evaluates the potential entropy changes to relevant paths after 
each message. 

[Figure: plot; x-axis: Message] 

Fig. 5. RCOP entropy changes $\Delta\mathcal{E}$ along propose(A, B, X) → ... → do(A, Y). 

For this purpose, let $\mathcal{F}'$ be the result of adding a new encounter 
$w'$ to the current repository $\mathcal{F}$ (we assume count(w) and cond(w) are 
computed as described in section 3). The entropy change induced on trajectory 
$w \in \Sigma^*$ by performing encounter $w' \in \Sigma^*$ is defined as 

$$\Delta\mathcal{E}_\mathcal{F}(w, w') = \mathcal{E}_{\mathcal{F}'}(w) - \mathcal{E}_\mathcal{F}(w)$$

This quantity provides a measure of the expectation-affirmative or 
expectation-negating character of an utterance. In other words, it expresses to 
which degree the agents are saying “yes” or “no” to an existing expectation. 

The conflict potential of an encounter can be derived by comparing the 
expected entropy change to the occurred entropy change, thus revealing to 
which degree the agents exceeded the expected change to expectation 
structures. We can define the conflict potential exerted by the occurred 
encounter $w''$ on encounter $w$ if the expected encounter was $w'$ as 

$$CV_\mathcal{F}(w'', w', w) = \int_{w[1]}^{w[|w|]} \Delta\mathcal{E}_\mathcal{F}(w, w'') - \Delta\mathcal{E}_\mathcal{F}(w, w') \, dw_i$$

This is the area under the “conflict curve” in figure 5, which computes 

$$\Delta\mathcal{E}(\text{“success”}, \text{“A cheats”}) - \Delta\mathcal{E}(\text{“success”}, \text{“success”})$$

This curve shows how the difference between expected and actual entropy change 
grows larger and larger, until the encounter is terminated unsuccessfully. This 
increases the probability that the participating agents will stop trusting the 
expectation structures, and that this will inhibit the normal flow of 
interaction, especially if CV is large for several paths w. 
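
Since a trajectory consists of discrete messages, the area under the conflict curve can be approximated by a sum over message positions. The sketch below is one such discrete reading and not the definition itself: the function name `conflict_potential` is hypothetical, and the entropy changes in the example are made-up numbers chosen only to illustrate a curve that drifts away from the expected one.

```python
from typing import Sequence


def conflict_potential(actual_delta: Sequence[float],
                       expected_delta: Sequence[float]) -> float:
    """Sum, over the messages of a path, of actual minus expected entropy change."""
    if len(actual_delta) != len(expected_delta):
        raise ValueError("both curves must cover the same messages")
    return sum(a - e for a, e in zip(actual_delta, expected_delta))


# Hypothetical entropy changes at three successive messages of a path.
expected = [0.0, -0.1, -0.2]   # changes if the expected encounter had occurred
actual = [0.1, 0.4, 0.9]       # changes caused by the encounter that did occur
cp = conflict_potential(actual, expected)
no_conflict = conflict_potential(expected, expected)
```

When the experienced changes coincide with the expected ones, the sum is zero regardless of how large the individual changes are.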
A noteworthy property of this view of conflict is that in cases where, for 
example, entirely new performatives are tried out, the conflict potential is 0 
because the expected entropy change (which is very large, because the agents 
know nothing about the consequences of the new performative) is identical to 
that actually experienced. So what matters about conflict is not whether the 
expectations associated with a message are clear, but rather whether the 
effect of uttering it comes close to our expectations about that effect on the 
expectation structures, a property we might call second-order expectability. 



7 Conclusions 

This paper presented a novel model for defining and analysing the semantics of 
agent communication that is radically empirical, consequentialist and construc- 
tivist. Essentially, it is based on the idea that the meaning of communication 
lies in the expectations associated with communicative actions in terms of their 
consequences. These expectations always depend on the perspective of an observer 
who has derived them from his own interaction experience. 

By relying on a simple statistical analysis of observed communication that 
makes no domain-dependent assumptions, the proposed model is very generic. It 
does impose some restrictions on the design of the agents by assuming them to be 
capable of recording and statistically analysing observed interaction sequences. 

A common critique of such “functionalist” semantics of agent communica- 
tion that has to be taken seriously is that there is more to communication than 
statistical correlations between messages and actions, because the purpose of 
communication is not always physical action (but also, e.g., exchange of infor- 
mation) and that many (in particular, normative) aspects of communication are 
neglected by reducing semantics to an empirical view. We still believe that such 
empirical semantics can serve as a “greatest common denominator” for diver- 
gent semantic models of different agents, if no other reliable knowledge about 
the meaning of messages is available. If, on the other hand, such knowledge 
is available, our framework can still be used “on top” of other (mentalistic, 
commitment-based) semantics. 

Using very general entropy-based measures for probabilistic expectation 
structures, we performed an analysis of different empirically observed interaction 
patterns. This analysis indicated that useful expectation structures are structures 
that leave enough room for autonomy but are at the same time reliable once 
certain paths are chosen by interactants - they are autonomy-respecting and 
contingency-reducing at the same time. 

Such structures are characterised by the following features: 

- external paths whose entropies remain fairly unaffected by agents’ choices in 
the early phases of an encounter; 

- low expectation entropy where utility deviation is high: the higher the 
potential loss or gain of a path, the more predictable it should be (esp. 
towards the end of an encounter); 

- alternatives for different utility configurations: paths that are likely to 
have a wider range of acceptable outcomes for the partners (e.g. by containing 
do-actions for all parties, cf. RCOP) are more likely to become stable 
interaction procedures, as they will be used more often. 

One of the strengths of our framework is that empirical semantics suggest in- 
cluding considerations regarding the usefulness of “having” a certain semantics 
in the utility-guided decision processes of agents. Agents can compute entropy 
measures of message trajectories prior to engaging in actual communication and 
assess the first- and second-order effects of their actions under current utility 
conditions or using some long-term estimate of how the utility function might 
change (i.e. which messages they will want to be reliable in the future). The 
fact that agents consider themselves being in the position of someone else (when 
computing entropy changes) links the protocol character of communication to 
the self-interested decision-making processes of the participating agents, thus 
making communication truly meaningful. 

As yet, we have not formalised an integrated decision-theoretic framework 
that allows these long-term considerations to be included in social “message- 
to-message” reasoning, but our model clearly provides a foundation for further 
investigation in this direction. Also, we have not yet dealt with the question 
of how an agent can optimally explore existing communicative conventions in a 
society so as to obtain a good expectation model as quickly as possible and how 
to balance this exploration with the exploitation in the sense of utility maximisa- 
tion. Our use of information-theoretic measures suggests that information value 
considerations might be useful in this respect. 

Another novelty of our framework is the definition of conflict potential as 
a “decrease in trust towards the communication system”. Sudden, unexpected 
“jumps” in entropies that become bigger and bigger render the expectation 
structures questionable, and the meaning of communicative acts becomes more and 
more ambiguous. This definition of computational conflict is very powerful because 
it does not resort to domain-dependent resource or goal configurations and is 
defined solely in terms of communicative processes. However, we have not yet 
suggested resolution mechanisms for such conflict interactions. We believe that 
reifying conflict in communication (i.e. making it the subject of communica- 
tion) is of paramount importance when it comes to conflict resolution and are 
currently working in this direction. 

Future work will also focus on observing and influencing evolving expecta- 
tions at different levels. In [13], we have recently proposed the formalism of 
expectation networks that is suitable for constructing large-scale “communica- 
tion systems” from observation of an entire MAS (or at least of large agent 
sub-populations) through a global system observer. There, we have also anal- 
ysed to which degree the frame repositories of locally reasoning InFFrA agents 
can be converted to global expectation networks and vice versa. This paves the 
way for developing methods that combine (i) a priori designer expectations that 
are represented through expectation structures [2] with (ii) emergent global com- 
munication patterns at the system level and (iii) agent rationality, creativity and 
adaptation through strategic application of communicative expectations.

We strongly believe that this approach has the capacity to unify the different
levels addressed in the process of engineering open MAS via the concept of 
empirical semantics. The material presented in this paper can be seen as a first 
step in this direction.

Abstract. The cognitive coherence theory for agent communication
pragmatics allows modelling a great number of agent communication
aspects while remaining computational. This paper describes our exploration
in applying the cognitive coherence pragmatics theory to BDI agent
communication. The presented practical framework relies on our dialogue
game based agent communication language (DIAGAL) and our dialogue
game simulator toolbox (DGS). It provides the necessary theoretical and
practical elements for implementing the theory as a new layer over classical
BDI agents. In doing so, it yields a general scheme for automating
agents’ communicational behavior. Finally, we give an example of the
resulting system execution.



1 Introduction 

Agent and multi-agent technologies allow the design and development of
complex applications. In the current distributed data processing paradigm, the
fundamental characteristic of these systems is the agents’ ability to communicate
with each other in a way that is useful with regard to their individual and
collective goals. While numerous works have aimed to define agent communication
languages, few have concentrated on their dynamic and automatic use by agents.
This last task is left to the system designers, who analyse and manually specify
the agents’ communicational behavior, usually by means of rules or by designing
ad hoc protocols and static procedures to use them. In this paper, we introduce
our investigation toward a theoretical and practical framework for the pragmatics
of agent communication, i.e. the automation of agents’ communicational behaviors.

In this paper, we first summarize our approach to agent communication
pragmatics, the cognitive coherence theory (section 2). This conceptual framework
is based on a unification of the cognitive dissonance theory, which is one
of the main motivational theories in social psychology, and Thagard’s philosophy
of mind theory: the coherence theory. After detailing our dialogue game based
agent communication language (DIAGAL, section 3), we briefly present our dialogue
game simulator (DGS, section 4), a practical framework for experimenting with
dialogue games. We then indicate how our coherence pragmatics approach was
implemented to automate conversations using DIAGAL games among BDI agents
(section 5). Finally, we give an example of automatic conversation between agents
to illustrate our “complete” automatic communication framework (section 6).

2 Dialogue Pragmatics 

2.1 The Cognitive Coherence Framework 

In cognitive sciences, cognitions gather all cognitive elements: perceptions,
propositional attitudes such as beliefs, desires and intentions, feelings and
emotional constituents as well as social commitments. From the set of all cognitions
result attitudes, which are positive or negative psychological dispositions towards
a concrete or abstract object or behavior. All attitude theories, also called
cognitive coherence theories, appeal to the concept of homeostasis, i.e. the human
faculty to maintain or restore some physiological or psychological constants
despite variations in the outside environment. All these theories share as a premise
the coherence principle, which puts coherence as the main organizing mechanism:
the individual is more satisfied with coherence than with incoherence. The
individual forms an open system whose purpose is to maintain coherence as
much as possible (one also speaks about balance or equilibrium). Attitude
changes result from this principle in cases of incoherence.

Our pragmatics theory follows from those principles by unifying and extending
the cognitive dissonance theory, initially presented in 1957 by Festinger [6],
with the coherence in thought and action theory of the computational philosopher
Thagard [20]. This last theory allows us to directly link the cognitive dissonance
theory with the notions, common in AI and MAS, of elements and constraints.

In our theory, elements are the agent’s private and public cognitions: beliefs,
desires, intentions and social commitments. Elements are divided into two sets:
the set A of accepted elements (which are interpreted as true, activated or valid
according to the element type) and the set R of rejected elements (which are
interpreted as false, inactivated or not valid according to the element type).
Every non-explicitly accepted element is rejected. Two types of non-ordered
binary constraints on these elements are inferred from the pre-existing relations
that hold between them in the agent’s cognitive model:

— Positive constraints: positive constraints are inferred from positive relations,
which can be explanation relations, deduction relations, facilitation relations
and all other positive associations considered.

— Negative constraints: negative constraints are inferred from negative relations:
mutual exclusion, incompatibility, inconsistency and all other negative
relations considered.

Each of these constraints is given a weight reflecting the importance and validity
degree of the underlying relation. These constraints can be satisfied
or not: a positive constraint is satisfied if and only if the two elements
that it binds are both accepted or both rejected. Conversely, a negative
constraint is satisfied if and only if one of the two elements that it binds is
accepted and the other one rejected. So, two elements are said to be coherent
if they are connected by a relation to which a satisfied constraint corresponds.
Conversely, two elements are said to be incoherent if and only if they are
connected by a relation to which a non-satisfied constraint corresponds. Given a
partition of elements among A and R, one can measure the coherence degree of
a non-empty set of elements by adding the weights of the satisfied constraints
connected to this set (the constraints of which at least one pole is an element of
the considered set) and dividing by the total number of concerned constraints.
Symmetrically, the incoherence of a set of cognitions can be measured by adding
the weights of the non-satisfied constraints concerned with this set and dividing by
the total number of concerned constraints.
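The two measures just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in our system: the `Constraint` type, the function names and the element labels are hypothetical, while the satisfaction conditions and the normalisation by the number of concerned constraints follow the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """Weighted, non-ordered binary constraint between two cognitive elements."""
    a: str
    b: str
    weight: float
    positive: bool  # True: positive constraint, False: negative constraint


def is_satisfied(c: Constraint, accepted: set) -> bool:
    # Every element that is not explicitly accepted is treated as rejected.
    same_side = (c.a in accepted) == (c.b in accepted)
    # Positive: both accepted or both rejected. Negative: exactly one accepted.
    return same_side if c.positive else not same_side


def _concerned(elements: set, constraints: list) -> list:
    # A constraint concerns the set if at least one of its poles belongs to it.
    return [c for c in constraints if c.a in elements or c.b in elements]


def coherence(elements: set, constraints: list, accepted: set) -> float:
    """Sum of satisfied weights divided by the number of concerned constraints."""
    concerned = _concerned(elements, constraints)
    if not concerned:
        return 0.0
    return sum(c.weight for c in concerned if is_satisfied(c, accepted)) / len(concerned)


def incoherence(elements: set, constraints: list, accepted: set) -> float:
    """Sum of non-satisfied weights divided by the number of concerned constraints."""
    concerned = _concerned(elements, constraints)
    if not concerned:
        return 0.0
    return sum(c.weight for c in concerned if not is_satisfied(c, accepted)) / len(concerned)
```

For instance, with a positive constraint of weight 1.0 between an intention `i1` and a commitment `c1`, accepting `i1` while `c1` stays rejected leaves that constraint unsatisfied, so `incoherence({"i1"}, ...)` is 1.0.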

In this frame, the basic hypothesis of the cognitive dissonance theory is that
incoherence (what Festinger names dissonance [6]) produces for the agent a tension
which incites him to change. The more intense the incoherence, the stronger
the dissatisfaction and the motivation to reduce it. The incoherence degree of a
cognition can be reduced by: (1) abolishing or reducing the importance of incoherent
cognitions; (2) adding or increasing the importance of coherent cognitions.

Festinger’s second hypothesis is that in case of incoherence, the individual is
not only going to change his cognitions or to try to change those of others in
order to reduce it, he is also going to avoid all the situations which risk increasing it.
Those two hypotheses were verified by a large number of cognitive and social
psychology studies and experiments [25].

One of the major advantages of the cognitive dissonance theory captured by
our formulation is that it supplies incoherence (that is, dissonance in Festinger’s
terminology) measures, i.e. a metric for cognitive coherence. This metric is available
at every level of the system: for a cognitive element, for a set of elements, for
an agent, for a group of agents or even for the whole MAS. These measures
match exactly the dissonance intensity measures first defined by Festinger
(because a dissonance link in Festinger’s model corresponds to a non-satisfied
constraint in Thagard’s model and a consonance link corresponds to a satisfied
constraint). Festinger’s hypotheses, along with those measures, give us a very
general motivational scheme for our agents.

2.2 Dialogue as Coherence Seeking 

As we argue elsewhere [12, 13], using coherence as a motivational motor allows 
us to model a great number of expected features for dialogue pragmatics. In 
particular, it allows us to answer (even partially) the following questions: 

1. Why should agents dialogue? Agents dialogue in order to reduce incoherences
they cannot reduce alone. We distinguish internal (or personal) incoherence
from external (or collective) incoherence depending on whose elements are
involved in the incoherence 1.

1 In the presented system, external elements are social commitments.

2. When should an agent take a dialogue initiative, on which subject and with
whom? An agent engages in a dialogue when an incoherence magnitude
exceeds a fixed level 2 and he cannot reduce it alone, either because it is
an external incoherence and he cannot accept or reject external cognitions on
his own, or because it is an internal incoherence he fails to reduce alone. The
subject of this dialogue should thus focus on the elements which constitute
the incoherence. The dialogue partners are the other agents involved in the
incoherence if it is an external one or an agent he thinks could help him in
the case of a merely internal incoherence.

3. By which type of dialogue? Even if we gave a general mapping of incoherence
types toward dialogue types [13], the theory is generic enough to be applied
to any conventional communicational framework. Hereafter (section 5),
we give the procedural scheme for this choice using DIAGAL dialogue games
as primitive dialogue types.

4. How to define and measure the utility of a conversation? As we state in [12], 
following the coherence principle and the classical definition of utility func- 
tions, the utility of a dialogue is the difference between the incoherence before 
and after this dialogue. Furthermore, we define the expected utility of a dia- 
logue as the incoherence reduction in case of success of the dialogue, i.e. the 
expected dialogue results are reached. As dialogues are attempts to reduce
incoherence, expected utility is used to choose between different competing
dialogue types (dialogue games in our case).

5. When to stop dialogue or else, how to pursue it? The dialogue stops when
the incoherence is reduced. Otherwise, either it continues, structured
according to the chain of incoherence reductions, or it stops because things
cannot be re-discussed anymore (this case, where incoherence persists, often
leads to attitude change as described in section 5).

6. What are the impacts of the dialogue on agents’ private cognitions? In cases 
where dialogue, considered as an attempt to reduce an incoherence by work- 
ing on the external world, definitively fails, the agent reduces the incoherence 
by changing his attitudes in order to recover coherence (this is the attitude 
change process described in section 5). 

7. Which intensity to give to illocutionary forces of dialogue acts? Evidently, 
the intensities of the illocutionary forces of dialogue/speech acts generated 
are influenced 3 by the incoherence magnitude. The more important the in- 
coherence magnitude is, the more intense the illocutionary forces are. 

8. What are the impacts of the dialogue on agents’ mood? The general scheme
is that, following the coherence principle, coherence is a source of satisfaction
and incoherence is a source of dissatisfaction. We derive emotional attitudes
from the internal coherence dynamics (happiness arises from successful reduction,
sadness from a failed reduction attempt, fear from an important future
reduction attempt, stress and anxiety from a persisting incoherence, ...).

2 This level or a “Should I dialogue?” function allows us to model different strategies 
of dialogue initiative. 

3 Actually, this is not the only factor; as we exemplify elsewhere, other factors
could also matter: social role, hierarchical positions, ...

9. What are the consequences of the dialogue on social relations between agents?
Since agents can compute and store dialogue utility, they can build and
modify their relations with other agents in regard to their past dialogues.
For example, they can strengthen relations with agents with whom past
dialogues were efficient and useful, according to their utility measures, ...
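The initiative test of question 2 and the utility definitions of question 4 can be illustrated with the following Python sketch. It is a simplified reading of the text, not the system's code: all function names are hypothetical, and the incoherence values are assumed to be computed elsewhere.

```python
def should_dialogue(magnitude: float, threshold: float, reducible_alone: bool) -> bool:
    """Question 2: engage when the incoherence exceeds a fixed level
    and the agent cannot reduce it alone."""
    return magnitude > threshold and not reducible_alone


def dialogue_utility(incoherence_before: float, incoherence_after: float) -> float:
    """Question 4: utility is the difference between the incoherence
    before and after the dialogue."""
    return incoherence_before - incoherence_after


def expected_utility(incoherence_now: float, incoherence_if_success: float) -> float:
    """Incoherence reduction obtained if the dialogue reaches its expected results."""
    return dialogue_utility(incoherence_now, incoherence_if_success)


def choose_game(incoherence_now: float, candidates: dict) -> str:
    """Pick the candidate dialogue game with the highest expected utility.

    `candidates` maps a game name to the incoherence expected if that
    game succeeds (an assumed input format).
    """
    return max(candidates, key=lambda g: expected_utility(incoherence_now, candidates[g]))
```

With a current incoherence of 0.75, a request game expected to leave 0.25 is preferred over an inform game expected to leave 0.5, since its expected utility is larger.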

All those dimensions of our theory - except 7, 8 and 9 - will be exemplified 
in section 6. But before implementing our pragmatics theory, we need an agent 
communication language. 



3 A Dialogue Game Language Based on Commitments: 
DIAGAL 

DIAGAL (DIAlogue Games Agent Language) is our commitment-based agent
language in which we define the semantics of the communicative acts in terms of
public notions, e.g. social commitments [3]. The use of those public cognitions
allows us to overcome classical difficulties of “intentional” agent communication
approaches: the sincerity hypothesis does not hold anymore and the semantic
verification problem is solved (see [14] for explanations).

3.1 Social Commitments 

As our approach is based on commitments, we start with some details about the
notion of commitment. The notion of commitment is a social one, and should
not be confused with the notion of individual commitment used to emphasize
the persistence of individual intentions. Conceptually, social commitments model the
obligations agents contract toward one another. Crucially, commitments are oriented
responsibilities contracted towards a partner or a group. In line with [24],
we distinguish action commitments from propositional commitments.

Commitments are expressed as predicates with an arity of 6. An accepted
action commitment thus takes the form:

$C(x, y, a, t, s_x, s_y)$

meaning that x is committed towards y to a at time t, under the sanctions $s_x$
and $s_y$. The first sanction specifies conditions under which x reneges on its
commitment, and the second specifies conditions under which y can withdraw from
the considered commitment. Those sanctions 4 could be social sanctions (trust,
reputation, ...) as well as material sanctions (economical sanctions, repairing
actions, ...). An accepted propositional commitment would have propositional
content p instead of a. Rejected commitments take the form $\neg C(x, y, a, t, s_x, s_y)$,
meaning that x is not committed toward y to a.
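As an illustration, the six-place predicate can be represented by a small immutable record. The following Python sketch is an assumption made for exposition: the field names, the `propositional` flag and the textual rendering are hypothetical and do not come from DIAGAL itself.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Commitment:
    """C(x, y, a, t, s_x, s_y): x is committed towards y to a at time t."""
    debtor: str                              # x
    creditor: str                            # y
    content: str                             # action a, or proposition p
    time: int                                # t
    debtor_sanction: Optional[str] = None    # s_x, left abstract here
    creditor_sanction: Optional[str] = None  # s_y, left abstract here
    propositional: bool = False              # False: action commitment
    accepted: bool = True                    # False: rejected commitment

    def rejected(self) -> "Commitment":
        """Return the rejected counterpart of this commitment."""
        return replace(self, accepted=False)

    def __str__(self) -> str:
        sign = "" if self.accepted else "not "
        return f"{sign}C({self.debtor}, {self.creditor}, {self.content}, t{self.time})"
```

Keeping the record frozen means that creation and cancellation, the only two operations on commitments, never modify an existing commitment in place.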

This notation for commitments is inspired by [18], and allows us to compose
the actions or propositions involved in the commitments: $a_1 | a_2$ classically
stands for the choice, and $a_1 \Rightarrow a_2$ for the conditional statement that $a_2$ will
occur in case of the occurrence of the event $a_1$. Finally, the operations on the
commitments are just creation and cancellation.

4 Since we did not investigate a whole agent architecture in this paper, we leave
sanctions as a realistic conceptual abstraction.

Now, we need to describe the mechanism by which the commitments are 
discussed and created during the dialogue. This mechanism is precisely modelled 
within our game structure. To account for the fact that some commitments are 
established within the contexts of some games and only make sense within this 
context [9,11], we make explicit the fact that those dialogical commitments are 
particular to game g (by indicating g as a subscript). This will typically be the 
case of the dialogue rules involved in the games, as we will see below. 

3.2 Game Structure 

We share with others [4, 7, 11] the view of dialogue games as structures regulat- 
ing the mechanism under which some commitments are discussed through the 
dialogue. Unlike [4, 11] however, we adopt a strict commitment-based approach 
within game structure and express the dialogue rules in terms of dialogical com- 
mitments. Unlike [7] on the other hand, we consider different ways to combine 
the structures of the games. 

In our approach, games are considered as bilateral structures defined by: 

— entry conditions, (E): conditions which must be fulfilled at the beginning of 
the game, possibly by some accommodation mechanism; 

— success conditions, ( S ): conditions defining the goal of the initiator partici- 
pant when engaged in the game; 

— failure conditions, ( F ): conditions under which the initiator can consider 
that the game reached a state of failure; 

— dialogue rules , ( R ): rules specifying what the conversing agents are “dialog- 
ically” committed to do. 

As previously explained, all these notions, even dialogue rules, are defined in 
terms of (possibly conditional, possibly dialogical) commitments. Within games, 
conversational actions are time-stamped as “turns” ($t_0$ being the first turn of
dialogue within this game, $t_f$ the last).

3.3 Grounding and Composing the Games 

The specific question of how games are grounded through the dialogue is cer- 
tainly one of the most delicate [10]. Following [16], we assume that the agents 
can use some meta-acts of dialogue to handle game structure and thus propose
to enter a game, propose to quit the game, and so on. Games can have different
statuses: they can be open, closed, or simply proposed. How this status is
discussed in practice is described in a contextualization game which regulates 
this meta-level communication. Figure 1 indicates the current contextualisation 
moves and their effects in terms of commitments. For example, when a proposition
to enter a game j (prop.in(x, y, j)) is played by the agent x, y is committed
to accept (acc.in), to refuse (ref.in) or to propose entering another game j'
(prop.in(y, x, j')), which would lead to a presequencing type of dialogue game
structuration.

Move                Operations
prop.in(x, y, j)    create(y, C_j(y, x, acc.in(y, x, j) | ref.in(y, x, j) | prop.in(y, x, j')))
prop.out(x, y, j)   create(y, C_j(y, x, acc.out(y, x, j) | ref.out(y, x, j)))
acc.in(x, y, j)     create dialogical commitments for game j
acc.out(x, y, j)    suppress dialogical commitments for game j
ref.in(x, y, j)     no effect on the public layer
ref.out(x, y, j)    no effect on the public layer

Fig. 1. DIAGAL contextualisation game.
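The contextualisation moves of Fig. 1 can be read as a small transition table. The Python sketch below is a hedged illustration: the reply sets follow Fig. 1, but the status bookkeeping is an assumption of this sketch (in particular, treating a refused entry as leaving the game closed), and all names are hypothetical.

```python
# Replies the hearer is committed to choose from after a proposal (Fig. 1).
# After prop.in, the third option proposes another game j'.
REPLIES = {
    "prop.in": ("acc.in", "ref.in", "prop.in"),
    "prop.out": ("acc.out", "ref.out"),
}

MOVES = {"prop.in", "prop.out", "acc.in", "acc.out", "ref.in", "ref.out"}


def allowed_replies(move: str) -> tuple:
    """Moves the hearer may play in answer to `move` (empty if none is required)."""
    return REPLIES.get(move, ())


def apply_move(status: dict, move: str, game: str) -> dict:
    """Return a new game -> status mapping after a contextualisation move.

    Status values are 'proposed', 'open' or 'closed'. Assumption of this
    sketch: ref.in leaves the game closed; prop.out and ref.out leave the
    status unchanged.
    """
    if move not in MOVES:
        raise ValueError(f"unknown contextualisation move: {move}")
    new_status = dict(status)
    if move == "prop.in":
        new_status[game] = "proposed"
    elif move == "acc.in":
        new_status[game] = "open"    # dialogical commitments are created
    elif move in ("acc.out", "ref.in"):
        new_status[game] = "closed"  # acc.out suppresses dialogical commitments
    return new_status
```

A proposal followed by an acceptance thus moves a game from proposed to open, while a refused exit proposal leaves it open.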

Concerning the possibility of combining the games, the seminal work of [24]
and the follow-up formalisation of [16] have focused on the classical notions of
embedding and sequencing. Even if recent works, including ours, extend this to
other combinations [11, 3], in our present simulation framework we only consider
the three game compositions allowed by the previous contextualisation game.

— Sequencing, noted $g_1 ; g_2$, which means that $g_2$ is proposed after the
termination of $g_1$.

— Pre-sequencing, noted $g_2 \leadsto g_1$, which means that $g_2$ is opened while $g_1$ is
proposed. Pre-sequencing is used to establish or enable some of the entry
conditions of $g_1$, or to make some information explicit prior to entering $g_1$.

— Embedding, noted $g_1 < g_2$, which means that $g_1$ is opened while $g_2$ was
already opened.

A game stack captures that commitments of the embedded games are con- 
sidered as having priority over those of the embedding game. 

3.4 Basic Games 

Up to now we have introduced four basic building dialogue games, which are 
exactly those which lead (in case of success) to the four types of commitments 
which can hold between two agents X and Y , namely: 

1. for an attempt to have an action commitment from Y toward X accepted, 
agent X can use a “request” game (rg); 

2. for an attempt to have an action commitment from X toward Y accepted, 
agent X can use an “offer” game (og); 

3. for an attempt to have a propositional commitment from X toward Y ac- 
cepted, agent X can use an “inform” game (ig); 

4. for an attempt to have a propositional commitment from Y toward X ac- 
cepted, agent X can use an “ask” game ( ag ). 
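The correspondence between the four commitment types and the four basic games amounts to a two-entry lookup. A minimal Python sketch, in which the function name and the argument encoding are assumptions of this illustration:

```python
def basic_game(debtor_is_initiator: bool, kind: str) -> str:
    """Return the basic game the initiator X can use to get a commitment accepted.

    `kind` is "action" or "propositional"; `debtor_is_initiator` is True when
    the commitment to establish is from X toward Y, False when it is from
    Y toward X.
    """
    table = {
        (False, "action"): "rg",        # request: commitment from Y toward X
        (True, "action"): "og",         # offer: commitment from X toward Y
        (True, "propositional"): "ig",  # inform: commitment from X toward Y
        (False, "propositional"): "ag", # ask: commitment from Y toward X
    }
    return table[(debtor_is_initiator, kind)]
```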

The next subsections detail those four games. Sanctions were omitted in our
game specifications just for better readability.

Request Game (rg). This game captures the idea that the initiator (x) “requests”
an action a from the partner (y), who can “accept” or “reject” it.
The conditions and rules are:



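\[
\begin{array}{ll}
E_{rg} & \neg C(y, x, a, t_0) \text{ and } \neg C(y, x, \neg a, t_0) \\
S_{rg} & C(y, x, a, t_f) \\
F_{rg} & \neg C(y, x, a, t_f) \\
R_{rg} & C_g(x, y, \mathit{request}(x, y, a), t_0) \\
       & C_g(y, x, \mathit{request}(x, y, a) \Rightarrow C_g(y, x, \mathit{accept}(y, x, a) \mid \mathit{refuse}(y, x, a), t_1), t_0) \\
       & C_g(y, x, \mathit{accept}(y, x, a) \Rightarrow C(y, x, a, t_2), t_0) \\
       & C_g(y, x, \mathit{refuse}(y, x, a) \Rightarrow \neg C(y, x, a, t_2), t_0)
\end{array}
\]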



Offer Game ( og ). An offer is a promise that is conditional upon the partner’s 
acceptance. To make an offer is to put something forward for another’s choice 
(of acceptance or refusal). To offer then, is to perform a conditional commissive. 
Precisely, to offer a is to perform a commissive under the condition that the
partner accepts a. Conditions and rules are in this case:



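\[
\begin{array}{ll}
E_{og} & \neg C(x, y, a, t_0) \text{ and } \neg C(x, y, \neg a, t_0) \\
S_{og} & C(x, y, a, t_f) \\
F_{og} & \neg C(x, y, a, t_f) \\
R_{og} & C_g(x, y, \mathit{offer}(x, y, a), t_0) \\
       & C_g(y, x, \mathit{offer}(x, y, a) \Rightarrow C_g(y, x, \mathit{accept}(y, x, a) \mid \mathit{refuse}(y, x, a), t_1), t_0) \\
       & C_g(x, y, \mathit{accept}(y, x, a) \Rightarrow C(x, y, a, t_2), t_0) \\
       & C_g(x, y, \mathit{refuse}(y, x, a) \Rightarrow \neg C(x, y, a, t_2), t_0)
\end{array}
\]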



Inform Game (ig). Notice that a human partner can be disposed to be in
accord or agreement with someone without uttering any word. He can also agree
by performing an explicit speech act. In this case, which is required for agents since
they do not support implicit communication, the partner can agree or disagree. The
conditions and rules for this couple are the following:



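\[
\begin{array}{ll}
E_{ig} & \neg C(x, y, p, t_0) \text{ and } \neg C(x, y, \neg p, t_0) \\
S_{ig} & C(x, y, p, t_f) \\
F_{ig} & \neg C(x, y, p, t_f) \\
R_{ig} & C_g(x, y, \mathit{assert}(x, y, p), t_0) \\
       & C_g(y, x, \mathit{assert}(x, y, p) \Rightarrow C_g(y, x, \mathit{agree}(y, x, p) \mid \mathit{disagree}(y, x, p), t_1), t_0) \\
       & C_g(x, y, \mathit{agree}(y, x, p) \Rightarrow C(x, y, p, t_1), t_0) \\
       & C_g(y, x, \mathit{disagree}(y, x, p) \Rightarrow \neg C(x, y, p, t_2), t_0)
\end{array}
\]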



Ask Game (ag). We use “ask” in the sense of asking a closed question, which
consists of requesting the partner to agree or disagree with a proposition p.
According to these remarks, we propose the following structure for the ask game:

\[
\begin{array}{ll}
E_{ag} & \neg C(y, x, p, t_f) \text{ and } \neg C(y, x, \neg p, t_f) \\
S_{ag} & C(y, x, p, t_f) \\
F_{ag} & \neg C(y, x, p, t_f) \\
R_{ag} & C_g(x, y, \mathit{question}(x, y, p), t_0) \\
       & C_g(y, x, \mathit{question}(x, y, p) \Rightarrow C_g(y, x, \mathit{agree}(y, x, p) \mid \mathit{disagree}(y, x, p), t_1), t_0) \\
       & C_g(y, x, \mathit{agree}(y, x, p) \Rightarrow C(y, x, p, t_2), t_0) \\
       & C_g(y, x, \mathit{disagree}(y, x, p) \Rightarrow \neg C(y, x, p, t_2), t_0)
\end{array}
\]
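The four specifications share one shape: an opening act by the initiator x, a positive or negative reply by the partner y, and a resulting accepted or rejected commitment. The following Python sketch makes that shape explicit; the table layout and the function name are assumptions of this illustration, while the acts and the debtor/creditor roles are taken from the rules above.

```python
# game: (opening act, positive reply, negative reply, debtor, creditor)
GAMES = {
    "rg": ("request", "accept", "refuse", "y", "x"),
    "og": ("offer", "accept", "refuse", "x", "y"),
    "ig": ("assert", "agree", "disagree", "x", "y"),
    "ag": ("question", "agree", "disagree", "y", "x"),
}


def game_outcome(game: str, reply: str) -> tuple:
    """Return (debtor, creditor, accepted) for the commitment at stake.

    `accepted` is True when the reply establishes the commitment, which is
    the success condition of the game, and False when the commitment ends
    up rejected, which is its failure condition.
    """
    _opening, positive, negative, debtor, creditor = GAMES[game]
    if reply not in (positive, negative):
        raise ValueError(f"{reply!r} is not a legal reply in game {game!r}")
    return debtor, creditor, reply == positive
```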



Notice that in those games, the included speech acts are labelled with a
relative integer (not shown in the figures) indicating the degree of illocutionary
force intensity relative to the default basic illocutionary force degree. For
example, in the request game, the request stands for the directive category for
action, which is mapped to: suggest: -2, direct: -1, request: 0, demand: 1, order:
2. Allowing agents to use the appropriate degree of illocutionary force intensity
for each dialogue/speech act leads to many variations of those basic games.
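The directive scale quoted above can be stored as a lookup from degree to act. In this Python sketch the table reproduces the mapping given in the text; clamping out-of-range degrees to the ends of the scale is an assumption of the sketch, and the names are hypothetical.

```python
# Degree of illocutionary force relative to the default act (degree 0).
DIRECTIVE_FORCES = {-2: "suggest", -1: "direct", 0: "request", 1: "demand", 2: "order"}


def directive_act(degree: int) -> str:
    """Return the directive act for a given intensity degree.

    Degrees outside [-2, 2] are clamped to the nearest end of the scale
    (an assumption; the text does not say how such values are handled).
    """
    lowest, highest = min(DIRECTIVE_FORCES), max(DIRECTIVE_FORCES)
    return DIRECTIVE_FORCES[max(lowest, min(highest, degree))]
```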



4 The Dialogue Game Simulator 

We have developed a toolbox, the dialogue game simulator, in order to simulate
and visualize game-based dialogue as presented in the previous section while
allowing the integration of some future concepts. The dialogue game simulator
(DGS) aims to be an effective tool for game testing and validation as well as a
means of exploring different agent architectures concerning dialogue pragmatics.
The DGS main interface allows managing connected agents, loading dialogue games
and visualizing synthetic dialogue diagrams. DGS was developed in Java using
JACK™ agent technology [8]. In this section, we briefly present the various
components of DGS.

4.1 Game Files 

As mentioned previously, a game is composed of entry conditions, success conditions,
failure conditions and rules. In DGS, each of these game components is
defined in its own file, which favours information re-use while facilitating
the maintainability of the files. All those files are written in XML. Using XML
has the advantage of being easily manageable in conjunction with Java while offering
a good way of describing information. The DTD (Document Type Definition)
associated with the XML files describes the precise way in which the game designer
must create his files. That gives designers and users a means of knowing whether a game
conforms to the specifications and whether it is manageable by the simulator.

The games are loaded when the simulator starts and are placed in a list from which
agents can load them when connecting.

4.2 Agenda and Dialogue Manager 

The agenda and dialogue manager are the principal tools provided by DGS.
Those tools should be embedded in all agents that aim to use loaded
DIAGAL dialogue games. The agenda is a kind of individual “commitment
store” where commitments are classified according to the time they were contracted.
This structure contains the commitments in action and propositional commitments
that hold as well as the dialogical commitments in action deduced from
the rules of the current dialogue game(s). Each agent has his own agenda, which does
not contain the commitments of all agents connected to the simulator, but
only those for which he is debtor or creditor.

The agenda is managed by the agent’s dialogue manager module, which adds
or removes commitments according to the current dialogue game rules and external
events. A commitment in action is fulfilled when an action (perceived as
an external event) that corresponds exactly to its description occurs. The dialogue
manager also checks that every agent’s operations conform to the current
contextualisation and opened dialogue games.
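The behaviour of the agenda can be sketched as follows. DGS itself is written in Java with JACK; this Python fragment is only an illustrative model under assumptions, with hypothetical class and method names, and it reduces a commitment to the four fields needed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    debtor: str
    creditor: str
    content: str  # description of the action committed to
    time: int     # turn at which the commitment was contracted


class Agenda:
    """Individual commitment store of one agent, ordered by contraction time."""

    def __init__(self, owner: str):
        self.owner = owner
        self._items = []

    def add(self, c: Commitment) -> bool:
        # Only commitments for which the owner is debtor or creditor are stored.
        if self.owner not in (c.debtor, c.creditor):
            return False
        self._items.append(c)
        self._items.sort(key=lambda item: item.time)
        return True

    def fulfil(self, action: str) -> list:
        """Remove and return the commitments whose description matches
        the perceived action exactly."""
        done = [c for c in self._items if c.content == action]
        self._items = [c for c in self._items if c.content != action]
        return done

    def pending(self) -> list:
        return list(self._items)
```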

4.3 Action Board and Game Stack 

The action board stores the actions which were played during simulation. It
is modelled as a UML sequence diagram. Each workspace has its own action
board where users can observe the exchanges of messages between agents as well 
as the time which is attached to these actions. It is represented as a history of 
the actions carried out relating to each initiated dialogue. The action board aims 
to help the simulator user understand and analyze what occurred in a dialogue 
between two agents. 

The game stack is a common structure used by dialogue managers of con- 
versing agents to keep track of the embedded games during a conversation. Each 
time a new game is opened, it is placed on the top of the stack inside the related 
workspace and it becomes the current game of this workspace. The stack makes
it possible to know which game will become active when the top one is
closed and withdrawn from the stack. This stack is also used to manage the priority
between the games: the top element has priority over the bottom
element.
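A game stack with these properties is an ordinary last-in, first-out structure. The Python sketch below is illustrative only; the class and method names are hypothetical and do not reflect the DGS code.

```python
class GameStack:
    """Keeps track of the embedded games opened in one dialogue workspace."""

    def __init__(self):
        self._games = []

    def open(self, game: str) -> None:
        # A newly opened game goes on top and becomes the current game.
        self._games.append(game)

    def close(self) -> str:
        # Closing withdraws the top game; the one below becomes active again.
        return self._games.pop()

    @property
    def current(self):
        return self._games[-1] if self._games else None

    def by_priority(self) -> list:
        # Games closer to the top have priority over those below.
        return list(reversed(self._games))
```

Opening a request game and then embedding an ask game in it gives the ask game priority; once it is closed, the request game is current again.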



4.4 Dialogue Workspace 

The dialogue workspace is an environment which contains all the data which are
specific to a dialogue between two agents: game stack, action board and some
information about hierarchical relations between conversing agents.

In Figure 2, we present a simplified overview of the DGS framework. This
figure presents two agents interacting through a dialogue workspace. They communicate
by sending each other messages (communicative actions) and, as such
messages are produced, the simulator places them into the action board. In
accordance with the current game on the game stack, the dialogue managers of
the sender and receiver agents deduce the appropriate commitments from the
game files and place them into their agendas.

In its current form, DGS allows simulating conversations between pairs of
software agents (three agents resulting in three pairs). The next section focuses
on our first attempt to implement the coherence theory for automating dialogues
between BDI agents. Those dialogues take place in the DGS
framework, using precisely the DIAGAL dialogue games presented in the previous
sections.

Fig. 2. Simulator overview.

5 Integrating Coherence Theory to BDI Agents 

5.1 Linking Private and Social Cognitions 

In this section, we describe our first attempt at the complex task of integrating
the concepts of the coherence theory into BDI agents’ practical reasoning. More
precisely, we implemented our coherence pragmatics as a new layer above the
existing BDI architectures. Since we do not propose a whole coherentist approach
to agent modelling, we have to extend the classical BDI framework so that
it fits our approach. In particular, traditional BDI frameworks do not
handle social commitments.

Choosing a conventional approach for agent communication leads us to extend
the intentional paradigm for agent practical reasoning issued from rational
interaction theories: a cognitive agent should not reason solely about his and others’
intentions, he should also reason about potential and already existing social
commitments (coming from held dialogues or the system’s conventions). In order to use
our pragmatics theory to automate the communication level of the traditional
BDI abstract architecture, we need to connect private cognitions (mental states)
with public ones (social commitments).

Prior to those links, we assume that our intentional layer is filtered from the
BDI agent’s whole set of intentions. We assume that the intentions we receive are
either social individual intentions or failed individual intentions 5. Social individual
intentions are intentions concerning goals which require social aspects to be
worked on. For example, an employee who has an intention about something
his boss would be responsible for would have to make some social commitments
socially accepted before achieving it. More generally, any intention that is embedded
in a somewhat collective activity would have to be a social individual
intention, except if it is part of an already socially accepted collective plan. Those
social intentions are intentions about a (even indirectly) collective state of affairs,
indicating that those intentions will be part of an external incoherence. Finally,
individual intentions concerning goals which do not match any individual plan
or whose associated plan failed could be included in this layer (this matches
the case where the agent faces an internal incoherence he cannot reduce alone).
This phase of identifying intentions which could have a social impact appears
to be crucial for integrating conventional approaches into existing cognitive agent
architectures.

In this context, we can return to the general question: what are the links 
between social commitments and private mental states? As a first answer, we 
propose linking private and public cognitions as follows 6 : 

— According to the classic practical reasoning scheme, private cognitions finally 
end in intentions through deliberation and we make the usual distinction 
between intention to (do something or make someone do something) and 
intention that (a proposition holds) [1]; 

— Regarding public cognitions, we distinguish commitments in action from 
propositional commitments [24]; 

— An accepted commitment is the socially accepted counterpart of an inten- 
tion, commitments in action are the counterparts of “intentions to” and 
propositional commitments are the counterparts of “intentions that” . 

Those relations are not completely new since many authors have already 
considered individual intentions as a special kind of individual commitment [1, 
23]. Our links extend this to reach the social level in the appropriate cases (social 
individual intentions or failed individual intentions). Constraints between the 
intentional private layer and the social commitments layer would be inferred 
from those links as well as any other logical links between intentions and social 
commitments. 

5.2 BDI Formulation of the Attitude Change Process 

In our model, any agent tries to maximize his coherence, i.e. tries to reduce his
incoherences beginning with the most intense one. To reduce an incoherence, the
agent has to accept or reject cognitions to better satisfy the constraints which
connect them. These cognitions can be private or public. To be able to integrate
communication into our model, it is now necessary to introduce the fundamental
link which exists between our formulation of the cognitive dissonance theory and
the notion of resistance to change.

5 With the “individual” qualifier in both, we mean that we do not refer to notions
of we-intention or collective intentions such as those developed by Searle [17] or
Tuomela [22]. Here, intentions are classical private intentions.

6 Although we give a first account here, much more work should be done on this
point.

Not all cognitions are equally modifiable. This is what Festinger names
the resistance to change of cognitions. The resistance to change of a cognition
is a function of the number and the importance of the elements with which
it is coherent; it also depends on its type, its age, and the way in which it
was acquired: perception, reasoning or communication. Social commitments are
particular cognitions which are not individually modifiable but must be socially
established, and dialogue games are tools for attempting to establish collectively
accepted commitments. That is, in order to get a social commitment accepted,
an agent has to have a dialogue. Dialogues are the only means for agents to try to
establish social commitments coherent with their private cognitions. However,
after those dialogues, some commitments can remain incoherent with private
intentions.

After any dialogue game, the discussed commitment is either accepted or
rejected. As we saw before, an accepted commitment can no longer be modified
without facing the associated sanctions. We also assume that a discussed
commitment which is still rejected will gain in resistance to change. The point
here is that an agent cannot keep trying indefinitely to have the desired
commitment accepted.

This resistance to change and the associated sanctions partially prevent
the agent from gaining coherence by changing the commitment acceptance state. We
could simplify by saying that the discussed commitments usually stand for social
obligations and fix one of the poles of the constraints which are connected
to them. To reduce possible incoherence while conforming to discussed
commitments, agents should then change their private cognitions to restore
coherence. This is the source of attitude change in our system, and it formalizes
the vision of the psychologists Brehm and Cohen on this subject [2], supported
by a great number of experiments.

In the present simplified framework, the only private cognitions we consider
are the intentions, but we assume that the underlying BDI layer would spread
the attitude change among all the private cognitions. An example of this attitude
change mechanism is supplied in section 6.

In MAS, knowing when an agent should try to modify the environment (the
public social commitments layer, among others) to satisfy his intentions, and
when the agent has to modify his mental states to be coherent with his
environment, is a crucial question. In practical reasoning, this question takes
the form: when should an agent reconsider his intention and deliberate again, and
when should he persist in acting in the previously deliberated way? As we have
just seen, within our approach, agents face the same problem, and different
strategies toward the modification of already discussed commitments (including
reasoning about sanctions and resistance to change in order to know if the agent
should persist or not) would lead to different individual commitment types in a
way analogous with that of Rao and Georgeff [15]. The main difference is that this
choice, like others, would be dynamically based on expected utility, i.e. expected
coherence gain.

[Figure 3: X's private mental states (resulting intentions) are linked to the
commitments X reasons about for the public social layer, and each commitment
type is linked to the dialogue game X can use to try getting it accepted.
Intention to: action commitments from X to Y (Offer) and from Y to X (Request).
Intention that: propositional commitments from X to Y (Inform) and from Y to X
(Ask).]

Fig. 3. Links between private cognitions, public cognitions and DIAGAL dialogue
games.

In Figure 3, we sum up (hiding the quantitative level of calculus) the means
by which we link intentions, social commitments and DIAGAL dialogue games.
From left to right, we have two types of intentions linked with the four
possible corresponding commitment types (the four seen in section 3.4).
Notice that until they have actually been discussed, those commitments are only
potential commitments generated by the agent to reason with. To cohere with
one of his accepted intentions, an agent will usually (according to the expected
utility calculus) consider trying to get the corresponding commitment accepted.
To make such an attempt, the agent will choose a DIAGAL dialogue game whose
success condition unifies with the wanted commitment.
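The game selection step can be illustrated with a minimal Python sketch. It assumes the four success conditions read from Figure 3 and encodes them with two role variables, "initiator" and "partner". The `Commitment` class, the `SUCCESS_CONDITIONS` table and `select_game` are hypothetical names introduced for illustration and are not part of DIAGAL or DGS.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Commitment:
    """A commitment pattern: who is committed, toward whom, and of which kind."""
    debtor: str
    creditor: str
    kind: str  # "action" or "propositional"


# Success conditions of the four games as read from Figure 3, written with the
# role variables "initiator" and "partner" (an illustrative encoding).
SUCCESS_CONDITIONS = {
    "offer": Commitment("initiator", "partner", "action"),
    "request": Commitment("partner", "initiator", "action"),
    "inform": Commitment("initiator", "partner", "propositional"),
    "ask": Commitment("partner", "initiator", "propositional"),
}


def select_game(wanted: Commitment, initiator: str) -> Optional[Tuple[str, str]]:
    """Return (game name, partner) for the first game whose success condition
    unifies with `wanted` when `initiator` opens the dialogue, else None."""
    for name, pattern in SUCCESS_CONDITIONS.items():
        if pattern.kind != wanted.kind:
            continue
        # Bind the two role variables to the agents named in the commitment.
        binding = {pattern.debtor: wanted.debtor, pattern.creditor: wanted.creditor}
        if binding["initiator"] == initiator and binding["partner"] != initiator:
            return name, binding["partner"]
    return None
```

With this encoding, an agent who wants his own action commitment accepted is directed to the offer game, while the creditor of the same commitment is directed to the request game.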



5.3 The Expected Utility Function 

As we have seen in section 2.1, the whole agent cognitive coherence is expressed
as the sum of weights of satisfied constraints divided by the sum of weights of all
constraints 7. At each step of his reasoning, an agent will search for a cognition
acceptance state change which maximizes the coherence increase, taking into
account the resistance to change of that cognition (technically a 1-optimal move).
If this cognition is a commitment, the agent will attempt to change it through
dialogue, and if it is an intention, it will be changed through attitude change. In
that last case, we call the underlying architecture of the agents to spread the
attitude change and re-deliberate.
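The coherence ratio can be sketched in a few lines of Python. The sketch assumes the usual constraint-satisfaction reading: a positive constraint is satisfied when both cognitions share the same acceptance state, a negative one when they differ. The class and function names are illustrative, and the small network is one possible reading of Paul's cognitive model in the example of section 6, not a transcription of Figure 4.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Constraint:
    a: str
    b: str
    positive: bool      # True: a and b should be accepted or rejected together
    weight: float = 1.0


def is_satisfied(c: Constraint, accepted: Dict[str, bool]) -> bool:
    same_state = accepted[c.a] == accepted[c.b]
    return same_state if c.positive else not same_state


def coherence(constraints: List[Constraint], accepted: Dict[str, bool]) -> float:
    """Weight of satisfied constraints divided by the weight of all constraints."""
    total = sum(c.weight for c in constraints)
    if total == 0:
        return 1.0  # assumption: an unconstrained model counts as fully coherent
    return sum(c.weight for c in constraints if is_satisfied(c, accepted)) / total


# Assumed reading of Paul's model: two intentions (I_*) and two potential
# commitments (C_*), all constraints with weight 1.
PAUL_CONSTRAINTS = [
    Constraint("I_cab", "C_cab", positive=True),    # correspondence
    Constraint("I_foot", "C_foot", positive=True),  # correspondence
    Constraint("I_cab", "I_foot", positive=False),  # mutually exclusive
    Constraint("C_cab", "I_foot", positive=False),  # incompatibility
    Constraint("C_foot", "I_cab", positive=False),  # incompatibility
]
PAUL_INITIAL = {"I_cab": True, "I_foot": False, "C_cab": False, "C_foot": False}
```

Under these assumptions, three of the five unit-weight constraints are satisfied in `PAUL_INITIAL`, which matches the coherence of 0.6 reported for Paul in section 6, and accepting `C_cab` satisfies all five.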

7 Notice that the general coherence problem, i.e. finding the partition of the
elements between A and R that maximizes coherence, is NP-complete. A formal
demonstration can be found in [21].

In our implementation, an agent determines the most useful cognition
acceptance state change by exploring all states reachable from its current
state and selecting the cognition which, in case of a successful change, would be
the most useful to change. A state is said to be reachable if it can be obtained
from the current state by modifying only one cognition. Since not all cognitions
can be modified equally easily, we introduced a notion of cost to take into
account the resistance to change or the sanctions associated with cognitions. All
explored states are thus evaluated through an expected utility function expressed
as below:

    g(exploredState) = coherence(exploredState) - cost(cognitionChanged)

where exploredState is the evaluated state, cognitionChanged is the cognition
whose change we are examining, and cost is a cost function expressed as:

1. if cognitionChanged is an intention, its cost of change equals its resistance
to change;

2. if cognitionChanged is a rejected commitment, its cost of change equals its
resistance to change, which is initially low but which could be increased at
each unfruitful attempt to establish it (depending on the agent’s commitment
strategy as we will see in the next section);

3. if cognitionChanged is an accepted commitment, its cost of change is
provided by its associated sanction (which could be null).
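The exploration of reachable states and the cost cases above can be sketched as follows. This is a minimal Python illustration, not the DGS implementation: the coherence measure is passed in as a function, the names are hypothetical, and the rule that a change is only proposed when g exceeds the current coherence is an assumption made here to account for the case where no change is useful.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, bool]  # cognition name -> True if accepted, False if rejected


def cost_of_change(cognition: str, state: State, kinds: Dict[str, str],
                   resistance: Dict[str, float],
                   sanction: Dict[str, float]) -> float:
    """Cost of flipping one cognition, following the three cases of the text."""
    if kinds[cognition] == "commitment" and state[cognition]:
        return sanction.get(cognition, 0.0)   # accepted commitment: sanction
    return resistance.get(cognition, 0.0)     # intention or rejected commitment


def most_useful_change(state: State,
                       coherence: Callable[[State], float],
                       cost: Callable[[str], float]) -> Optional[Tuple[str, float]]:
    """Explore every state reachable by flipping one cognition and return the
    (cognition, g) pair with the highest g, or None if no flip is worthwhile."""
    current = coherence(state)
    best: Optional[Tuple[str, float]] = None
    for cognition in state:
        explored = dict(state)
        explored[cognition] = not explored[cognition]
        g = coherence(explored) - cost(cognition)
        # Assumption: a change is only proposed if it beats the current coherence.
        if g > current and (best is None or g > best[1]):
            best = (cognition, g)
    return best
```

Because each explored state differs from the current one in a single cognition, the search is linear in the number of cognitions, which is what makes the 1-optimal move tractable even though the general partition problem is not.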

5.4 The Treatment Algorithm 

Our agents' behavior is guided by their coherence and their social commitments.
At each step of the simulation, our agents consult their agendas and behave in 
order to fulfill the commitments which have been deduced from previous actions 
of agents and rules of dialogue games. When agents must determine the actions 
they have to produce, they apply the following algorithm: 

Procedure CommunicationPragmatics()
1: List commitments = agenda.getCommitments();
2: List dialogCommitments = agenda.getDialogCommitments();
3: treatCommitments();
4: if dialogCommitments.isEmpty() then
5:     initiateDialogue();
6: else
7:     treatDialogCommitments();
8: end if

As we have seen in section 3.1, we distinguish between two types of commitments:
the dialogical ones and the extra-dialogical ones. The procedure for treating the
extra-dialogical commitments (line 3) consists in updating the cognitive model of
the agent by browsing the extra-dialogical commitments in the agenda and operating
as follows. (1) Each time an accepted commitment is encountered, the
corresponding commitment in the agent’s cognitive model is marked as accepted. If
the corresponding intention in the cognitive model of the agent is rejected, then
the agent calls the underlying BDI architecture for a possible attitude change
process. (2) Each time a rejected commitment is encountered, the resistance to
change of the corresponding potential commitment in his cognitive model is
increased, so that after possibly several unsuccessful attempts, this commitment
will be so expensive to establish that it will not constitute a useful change of
cognition 8. This last case would lead to attitude change. This operation is
performed before treating the dialogical commitments so that, as soon as a
commitment is established, it is taken into account in the rest of the dialogue.
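The two cases of this procedure can be sketched in Python as below. The data layout (a list of discussed commitments with their outcome, a dictionary mapping each commitment to its counterpart intention, and a callback standing in for the call to the underlying BDI architecture) is assumed for illustration only and does not reflect the actual agenda structures of DGS.

```python
from typing import Callable, Dict, Iterable, Tuple


def treat_commitments(
    discussed: Iterable[Tuple[str, bool]],      # (commitment, accepted after dialogue?)
    accepted: Dict[str, bool],                  # agent's cognitive model
    counterpart: Dict[str, str],                # commitment -> corresponding intention
    resistance: Dict[str, float],               # resistance to change per cognition
    increase: float,                            # set by the commitment strategy
    request_attitude_change: Callable[[str], None],  # hook to the BDI layer
) -> None:
    for commitment, was_accepted in discussed:
        if was_accepted:
            # Case (1): record the commitment; if the matching intention is
            # rejected, ask the underlying architecture to reconsider it.
            accepted[commitment] = True
            intention = counterpart[commitment]
            if not accepted[intention]:
                request_attitude_change(intention)
        else:
            # Case (2): a failed attempt makes the commitment costlier to try again.
            resistance[commitment] = resistance.get(commitment, 0.0) + increase
```

The `increase` parameter is where a commitment strategy would plug in: a value of zero leaves the cost of retrying unchanged, while a large value makes a single refusal enough to rule out further attempts.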

The procedure of initiating a dialogue (line 5) consists in searching for the 
most useful cognition to change 9 . If it is a commitment, the agent initiates 
a dialogue with the appropriate dialogue game, or begins an attitude change 
process if it is an intention. The choice of the appropriate dialogue game is made 
by unifying the commitment the agent wants to establish with the conditions of 
success of the games loaded in the simulator. 

Treating dialogical commitments (line 7) consists in exploring all the possible 
actions that are determined by dialogue games and selecting the one which has 
the best consequences for coherence. If the extra-dialogical commitment which 
is concerned by the current game is not the most useful change for the agent, 
it will embed a game by proposing the entrance in a new, subjectively more 
appropriate, dialogue game. 

Notice that coordination of dialogue turns is ensured by the dialogue games 
rules and the resulting dialogical commitments order in the agents’ agendas. 
Finally, this algorithm is called each time: 

— the underlying BDI architecture finishes a deliberation process (or a
re-deliberation process after a call initiated by our algorithm for an attitude
change process). We assume that the produced intentions are either social
individual intentions or individual intentions that the agent could not realize
alone.

— the agent has something in his agenda. This ensures that the agent
re-executes this algorithm until all dialogues are closed and that the agent will
treat dialogues initiated by others. For example, when the agent receives a
prop.in message for entering a particular dialogue game, the corresponding
dialogical commitment given by the contextualisation game is added to his
agenda. Notice that we assume, as a first simplification, that the agent is
dialogically cooperative and that he systematically accepts entering the game
(in the treatDialogCommitments() procedure).

Finally, we implement JACK™ BDI 10 agents using this pragmatics frame- 
work to manipulate DIAGAL dialogue games within the DGS. 

8 Notice that, following Rao and Georgeff's vocabulary [15], the amount of the
increase in resistance to change will lead to the different commitment strategies:
if this increase is null, the agent will be blindly committed to trying to get this
social commitment accepted; if the increase is drastically large, this individual
commitment will be an open-minded one; and in between, we would get a wide range
of single-minded commitment strategies. Notice that those commitment strategies
could dynamically depend on the incoherence magnitude, the dialogue topic, the
partner, the social context, ...

9 There could be none, for example if the coherence is already maximal. 

10 JACK is a commercial Java agent framework from Agent Oriented Software (AOS)
which implements PRS (Procedural Reasoning System) and dMARS (Distributed
Multi-Agent Reasoning System) concepts [8].

Fig. 4. Cognitive models of Paul and Peter.



6 Example 

Let’s assume that we have two agents, Paul and Peter, who have agreed on 
a common plan to go to the concert of their favorite band and split the bill. 
A subtask of this plan is to go to buy the tickets at the store. Paul has been 
assigned this task and is now about to deliberate about the way he will go to 
the store. He has to choose between two mutually exclusive intentions: the one 
of taking a cab and the one of going by foot. We assume that Paul’s underlying 
BDI architecture has accepted the first one and rejected the second one (perhaps 
in order to save time). As they will split the bill (and taking a cab costs
money), Peter would rather that Paul went by foot. Thus, he has the rejected
intention that Paul take a cab and the accepted one that Paul go by foot. 

Both intentions may be associated with two corresponding potential
commitments (according to the links established in section 5.1): the social
commitment from Paul toward Peter to take a cab and the social commitment from
Paul toward Peter to go by foot. In addition, the commitment to take a cab and the
intention of taking a walk are incompatible, as are the commitment of taking
a walk and the intention of taking a cab. From this initial state, according to
our model, a positive constraint between each intention and its pending commitment
is induced from the correspondence relation, and negative constraints are induced
from the mutually exclusive relation and the incompatibility relations. Figure 4
presents the network of intentions of both Paul (on the left side) and Peter
(on the right) as well as the pending rejected commitments. Notice that the
commitments represented are potential commitments used by agents to reason.
At this stage, they are not real social commitments since they have not been
established by dialogue. In this example, a weight of 1 has been assigned to all
constraints as a simplification 11.

In DGS, we can decide which agent has the acting initiative, thus determining
whose incoherence the dialogue will address. We will assume that Paul has the
initiative. Initially, as shown by Figure 4, Paul has three satisfied constraints
(number 1, 3 and 4) out of five constraints, so he has a coherence of 0.6.
Paul will therefore try to increase it by locating the most useful cognition to
change. Figure 5 shows the different states that can be reached by Paul from
his initial situation. Below each state are indicated the coherence c obtained in
this state as well as the value of the expected utility function g. According to
those results, Paul will make an attempt to get the commitment
C(Paul, Peter, take a cab) accepted. Since it is a social commitment, Paul will
use one of the dialogue games, which are tools for attempting to establish
commitments. Since this commitment is a commitment toward Peter, Peter will be
the dialogue partner. Paul will then choose between the available dialogue games
whose success conditions unify with the desired commitment. The only DIAGAL
dialogue game which has a success condition of the form
C(initiator, partner, action) is the offer game.

Fig. 5. States explored by Paul.

11 Considerations about hybrid symbolic-connectionist knowledge representation
techniques would take us beyond the scope of this article. We refer the interested
reader to Sun’s work [19].

Paul will thus propose to Peter to play this game; we suppose that Peter
is dialogically cooperative and would accept to play the game. Then, according
to the request game rules, Paul will produce a directive speech act with an
appropriate illocutionary force intensity degree 12.

Before replying, Peter will check whether he has a higher incoherence
to reduce by searching for his own most useful change of cognition, and will
locate the commitment from Paul toward him to go by foot, as shown in Figure 6.

Thus, Peter will embed a DIAGAL request dialogue game concerning this
commitment. Paul will answer Peter according to his coherence (which would
decrease in case of acceptance) and deny the proposition, and the resistance to
change of the still rejected commitment will increase. The embedded request
game is then closed. To illustrate the attitude change, we have drastically
increased the resistance to change of the commitment of taking a cab so that
Peter’s expected utility function will select the intention that Paul went by foot
as the most potentially useful change. At the end of this embedded dialogue
game, Peter’s treatCommitments() procedure will call the underlying BDI
architecture for a re-deliberation which would at least include the rejection of
the “intention to” that Paul went by foot.

12 We illustrate our example with the use of basic illocutionary force intensity
degrees for the speech/dialogue acts (here the “offer”), but DIAGAL allows us to
choose a specific strength degree for each speech act. Thus, the strength degree
could have been linked to: (1) Paul’s current incoherence magnitude, (2) Paul’s
expected increase of coherence, that is the expected utility, and (3) the social
positions of Peter and Paul, ...

Fig. 7. Cognitive models of Paul and Peter.

Propagating attitude change and re-deliberation (which would normally be
processed by the underlying architecture) is simulated in our present system
by systematically revising as many intentions as possible as long as it increases
whole coherence. The new cognitive models of the agents after this dialogue
are those of Figure 7. Paul’s intentions remain unchanged since no social
commitment conflicts with his intentions, while Peter’s have been reevaluated.
Peter, according to his new set of intentions, will then accept Paul’s offer to
take a cab, and they will finally quit the embedding offer dialogue game. After
this dialogue, both agents will have all their constraints satisfied (i.e. a
coherence of 1).



Resulting Dialogues. The sequence diagram shown in Figure 8 illustrates
the messages exchanged between Paul and Peter as detailed above. This diagram
is actually part of the action board which DGS fills during the execution so that
the user can see what the agents are doing.

Fig. 8. Dialogues between Paul and Peter.

The two dialogue games initiated by Paul and Peter are presented, as well
as the speech acts used by both agents. Notice that all those steps were carried
out automatically by the agents implementing our coherence theory for
communication pragmatics in the way described earlier.

In the case where Peter is given the initiative at the beginning, the
symmetrical dialogue would have happened: Peter trying to establish the commitment
of going by foot, Paul embedding a game on the commitment of taking a cab,
denied by Peter, and both finally agreeing on Paul going by foot. In that case the
dialogue results in the opposite situation. This is normal since we consider that
the commitments socially rejected by dialogue gain a very high resistance to
change. It results in a non-persistence of intentions in case of refusal (i.e. a
highly influenceable open-minded commitment strategy). In that particular case
(chosen in order to simplify the example), dialogue initiative plays a crucial
role.



7 Conclusion and Prospects 

The cognitive coherence theory for agent communication pragmatics allows
modelling a great number of agent communication dimensions while being
computational. This paper describes our exploration in applying the cognitive
coherence pragmatics theory to BDI agent communication. The presented practical
framework relies on our dialogue-game-based agent communication language
(DIAGAL) and our dialogue game simulator toolbox (DGS). It provides the
necessary theoretical and practical elements for implementing the theory as a
new layer over classical BDI agents. In doing so, it provides a general scheme for
automating agents' communicational behavior.

Classically, practical reasoning equals deliberation plus means-ends
reasoning. Deliberation is about deciding what states of affairs the agent wants
to achieve, whereas means-ends reasoning is about deciding how to achieve these
states of affairs. Within our model, coherence gain evaluation through the
expected utility function extends the deliberation process to take into account
the social level, whereas selecting a dialogue game by unifying its success
conditions with the wanted social result is part of the means-ends reasoning. We
also insist on the effect of dialogue on agents' private mental states through the
attitude change process. This process is activated by a kind of reconsider()
function (see [15]) which has been modelled and integrated in our expected utility
function and whose result depends on the chosen individual commitment strategy
(which is taken into account when the resistance to change of explicitly rejected
commitments is updated).

Although the architecture presented in this paper is efficient, much more
work remains to be done. In particular, we want to: (1) work more deeply
on the links between private and public cognitions; (2) provide a well-founded
theory for sanction and social relations dynamic management 13; (3) extend the
current framework with argumentation seen as constraint propagation, allowing
agents to reason about others’ cognitive constraints and thus take them into
account, introducing cooperation.

In this article we chose to apply our theory as a new layer above the existing
BDI architectures. A longer-term goal would be to propose a purely coherentist
approach for the whole cognitive agent architecture. This would allow us to take
more advantage of the power of coherentist approaches [20], using the powerful
hybrid symbolic-connectionist formalisms attached to them.

Abstract. In this paper, we describe several interesting design decisions we
have taken (with respect to inter-agent messaging) in the re-engineered CASA
architecture for agent communication and services. CASA is a new architecture
designed from the ground up; it is influenced by the major agent architectures
such as FIPA, CORBA, and KQML but is intended to be independent (which
doesn't imply incompatible). The primary goals are flexibility, extendibility,
simplicity, and ease of use. The lessons learned in the earlier implementation
have fed the current design of the system. Among the most interesting of the
design issues is the use of performatives that form a type lattice, which allows
observers who do not necessarily understand all the performatives to nonetheless
understand a conversation at an appropriate semantic level. The new design
considerations add a great deal of flexibility and integrity to an agent
communications architecture.



1 Introduction 

We have been working on an infrastructure for agent-based systems that would
easily support experimentation and development. We implemented the first-cut
system, and began work on the analysis of what we could learn from our
experience.

We had gained enough knowledge to formalize our theory of agent conversations 
based on social commitments [5], but the CASA infrastructure itself, although seem- 
ingly quite adequate, seemed rather ad-hoc, and lacking in solid theoretical founda- 
tion. 

The various types of service agents in the CASA system each sent and received 
what seemed to be a set of ad-hoc messages, which did the job, but still seemed unsat- 
isfying in that there seemed to be no apparent pattern to them. We had more-or-less 
followed the FIPA [8] performative model (the CASA message structure closely 
follows the FIPA structure, which, in turn borrows heavily from KQML [1]). 



F. Dignum (Ed.): ACL 2003, LNAI 2922, pp. 59-74, 2004. 
© Springer-Verlag Berlin Heidelberg 2004




60 Rob Kremer, Roberto Flores, and Chad La Fournie 



We decided to take a hard look at the system with an eye to finding patterns that
would let us modify our design into a more satisfying structure. This paper
describes some of our analysis that led to the current design.

In the remainder of this section we give a brief introduction to the CASA
infrastructure. In Section 2, we describe various problems and solutions (a.k.a.
design decisions) we have encountered in building and using the CASA system. These
include the arranging of performatives in a type lattice, the specification of some
relational constraints and the resulting compositional properties of some
messages, the addition of several new fields in messages, and the removal of some
performatives into a separate type lattice called acts. In Section 3 we focus on
conversations, rather than messages, and describe how CASA can support either
conversation protocols or social commitments as a basis for inter-agent
communication, and weigh some of the merits of the two different approaches. In
Sections 4 and 5 we discuss our future work and conclusions.



1.1 Background: CASA 

The fundamental objective of CASA is to support communication among agents and
related services. Therefore, CASA offers a generic communication and cooperation
infrastructure for message passing, archiving (data, knowledge bases, and
transaction histories), agent lookup, and remote agent access. It is an open
system infrastructure that may be easily extended as the future need for services
becomes apparent. CASA makes no demands on intra-agent architecture (internal
agent architecture); however, agent templates are provided in the form of classes
that one can inherit from and specialize (in Java and C++). Currently, generic
class-templates are provided, and conversation-protocol and social-commitment
specialized class-templates are under development.

As shown in Figure 1, Cooperation Domains (CDs) act as a central “hub” for
multi-agent conversations such that all participants may send messages directly to
the Cooperation Domain for point-to-point, multi-cast, type-cast 1, and broadcast
communication within the Cooperation Domain (a group of agents working together on
some task). Agents within a cooperation domain may also use the cooperation domain
to store persistent data that will permanently be associated with the
conversation, giving the conversation a lifetime beyond the transient
participation of the agents, as is often required. Cooperation Domains may also
store transaction histories for future playback of the chronological development
of the conversation artifacts. Cooperation Domains may perform all these tasks
because all messages use a standard "envelope" format (currently either KQML [1]
or XML [15]) 2, which flexibly provides a basic semantic “wrapper” on messages
that may be otherwise domain specific: both the utility of generic services and
the efficiency of domain specific languages are therefore provided.

1 Type-cast is similar to multi-cast, but the destination specification describes
agent types (rather than individuals), and the cooperation domain forwards the
message to agents known to conform to the type(s).

2 Actually, the envelope formats can be either KQML or XML (mixed) as we can
dynamically switch between the two owing to our employing the Chain of Command
pattern [10] to interpret messages.

Fig. 1. Basic Communication and Cooperation Services in CASA.

Yellow page servers (also called Yellow Page Agents, Yellow Pages, or YPs) allow 
agents to look up other agents by the services they offer. An area is defined nomi- 
nally as a single computer (but could be a cluster of several computers, or a partition 
on a single computer). There is exactly one local area coordinator (LAC) per area, 
which is responsible for local coordination and tasks such as “waking up” a local 
agent on behalf of a remote agent. All application agents reside in one or more areas. 
See Figure 2 for a screen dump of a LAC interface showing several agents running. 

Messages are needed to support interactions among agents. Messages are
generically defined as either Request, Reply or Inform messages, where Requests
are used to ask for the provision of a service, Replies to answer requests, and
Informs to notify agents without requiring the receiving agent to perform any
action. Since these messages are too ambiguous for the definition of interaction
protocols, other, more meaningful, message subtypes are derived from these general
definitions of messages. For example, Refuse is derived from Reply: an agent can
Refuse an earlier Request of another agent as a special type of Reply (as Agree
would be another special type of Reply).
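The derivation of message subtypes can be pictured as a small type hierarchy. The following Python sketch is illustrative only: the `PARENTS` table covers just the performatives named above, the common root "performative" is an assumption added so that the structure has a top element, and the function names are hypothetical rather than part of the CASA API.

```python
from typing import Dict, List, Optional, Set, Tuple

# Each performative lists its direct supertypes; a lattice may have several.
# The root "performative" is assumed here and is not taken from the text.
PARENTS: Dict[str, Tuple[str, ...]] = {
    "performative": (),
    "inform": ("performative",),
    "request": ("performative",),
    "reply": ("performative",),
    "agree": ("reply",),
    "refuse": ("reply",),
}


def supertypes(performative: str) -> List[str]:
    """The performative followed by all its supertypes, nearest first."""
    seen, ordered, frontier = {performative}, [performative], [performative]
    while frontier:
        next_frontier = []
        for current in frontier:
            for parent in PARENTS[current]:
                if parent not in seen:
                    seen.add(parent)
                    ordered.append(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return ordered


def is_a(sub: str, sup: str) -> bool:
    return sup in supertypes(sub)


def interpret(performative: str, understood: Set[str]) -> Optional[str]:
    """Most specific supertype an observer understands, or None."""
    for candidate in supertypes(performative):
        if candidate in understood:
            return candidate
    return None
```

An observer that only understands Reply can still treat a Refuse as a Reply, which is the kind of partial understanding the type lattice is meant to allow.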



2 Messages 



In our analysis of the system we first looked at the performatives in the messages,
and existing FIPA performatives. These include:

Accept Proposal      Agree             Cancel
Call for Proposal    Confirm           Disconfirm
Failure              Inform            Inform If
Inform Ref           Not Understood    Propagate
Propose              Proxy             Query If
Query Ref            Refuse            Reject Proposal
Request              Request When      Request Whenever
Subscribe

Fig. 2. A screen dump of a CASA LAC interface with a Cooperation Domain and two
simple ‘chat’ agents communicating with one another. All three agents are running
in the same process here, but they may all run in independent processes or on
different machines.



2.1 The Problem 

The FIPA performatives seem rather awkward when one tries to use them in a real 
application such as this. The meaning of each and its relationship to the others is not 
always clear. In some cases, it seems they have overlapping meaning. For example, 
Refuse and Reject Proposal can occur under the same circumstances, and the former 
would seem to be a subtype of the latter. Thus, they should be arranged in a type
lattice. Furthermore, our use of them (in developing the CASA infrastructure) seemed 
to indicate that Inform, Request, and Reply were somehow much more fundamental 
than the others. 

A common technique used in communication protocols (especially in the face of 
unreliable communication channels) is to always have the receiver return a confirma- 
tion (an acknowledge message) to the sender so that the sender can verify that the 
message was received (and resend if necessary). Such an Ack message is conspicu- 
ously missing from the FIPA performatives, so we chose to extend FIPA's protocol 3 , 
and go with an Ack (acknowledgement) to an Inform to allow us to check on trans- 
mission of a message in the event we are using an unreliable communication channel. 



3 In fact, we're not sure if we really are extending FIPA's protocol, since we could find no 
protocol for a simple Inform. So we aren't sure if a FIPA agent would expect an Ack after an 
Inform or not. 
(Note that we do not require agents to use the Ack protocol, but agents may agree to
use it if so desired. 4 )

When we tried to draw a type hierarchy of all the messages required in CASA, we 
found it looked something like this 5 : 




Fig. 3. 

For example, when an agent wishes to join a cooperation domain, it sends a joinCD
(lattice B) request to the cooperation domain, and the cooperation domain responds
with a joinCD (lattice B) reply.



2.2 Rearranging Performatives 

It seems interesting that for each specific Request, we have a mirrored Reply and,
similarly, for each specific Inform we have a mirrored Ack. It seems to us that
Request/Reply and Inform/Ack are more fundamental than their subtypes. It appears we
should keep Request/Reply and Inform/Ack as "fundamental message types", and then
put all the other things in a separate type lattice of "ACTs". We extend our message
headers to include a new field, act. This more-or-less turns performatives into speech
acts, and acts into physical acts.

Thus, our message header looks like this: 

    performative: request
    act:          inviteToJoinCD
    to:           Bob
    from:         Alice
    receiver:     CDagent1
    sender:       Alice
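
The header above can be pictured as a small record. The following is a minimal Python sketch, assuming only the six fields shown in the listing; the class name `Header`, the attribute name `from_`, and the helper `is_forwarded` are illustrative names of our own and are not part of CASA.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Header:
    """Illustrative CASA-style message header (all names are assumptions)."""
    performative: str             # fundamental speech act, e.g. "request"
    act: str                      # entry in the separate lattice of acts
    sender: str                   # agent that physically sends the message
    receiver: str                 # agent that physically receives the message
    to: Optional[str] = None      # ultimate addressee, if one is given
    from_: Optional[str] = None   # original author ("from" is a Python keyword)

    def is_forwarded(self) -> bool:
        """True when the physical receiver is not the ultimate addressee."""
        return self.to is not None and self.to != self.receiver


# The header from the text: Alice asks CDagent1 to pass an invitation on to Bob.
header = Header(performative="request", act="inviteToJoinCD",
                sender="Alice", receiver="CDagent1", to="Bob", from_="Alice")
```

Keeping `to`/`from_` separate from `sender`/`receiver` is what would let an observer tell a relayed message from one addressed directly to its recipient.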



Getting back to acknowledging protocols, we can write down a very simple conversation
protocol: that of an Inform/Ack pair (here we use diamond-shaped arrowheads to
indicate sequence):



4 Actually, an agent who isn't using the Ack protocol will Ack if it receives a message marked
with a request-ack field - see Figure 13 and Section 2.7.

5 This type hierarchy is a simplification of the hierarchy we built containing all the performa- 
tives in the FIPA Communicative Act Repository Specification (FIPA00037) [7], which 
ended up being fairly complex. Although simplified, the above diagram captures the essence 
of the original. 
[Diagram: an inform message followed by an ack.]

Fig. 4.

This adds a certain sense of reliability to the message transmission: Alice will always
know (by the Ack) that Bob received her Inform 6 . In the case of a Request/Reply,
if Bob received a Request from Alice, and Replies, then Alice will know that Bob
received the Request, so there is no need for an Ack in this case. But Bob (the
receiver of the request) has no way of knowing if Alice received the Reply. It
would be appropriate for Alice to send an Ack to Bob:

[Diagram: a request followed by a reply, followed by an ack.]

Fig. 5.



If this is the case, then Reply looks a lot like Inform. It is a specialization
(subtype) of Inform:






[Diagram: inform, ack, and request, with reply drawn as a subtype of inform.]

Fig. 6.

This makes sense: after all, if I’m replying to you, I’m informing you of what I’m
doing with your request.

Furthermore, it would be consistent to think of a Reply as being a kind of Ack to a 
Request. After all, if we reply to a request, the reply certainly carries all the function- 
ality of an Ack. Therefore, Reply would be a subtype of both Inform and Ack: 

[Diagram: inform, ack, and request, with reply drawn as a subtype of both inform and ack.]

Fig. 7.

By the same reasoning by which a reply is an inform, we can derive that a request must
also be an inform. After all, if I’m requesting something of you, I’m informing you
that I want you to do something for me:



6 But Alice won't know if Bob didn't receive the Inform, since the Ack could have been lost. 
This is a well known problem in the literature. 
[Diagram: type hierarchy of inform, ack, request, and reply; request is a subtype of
inform, and reply is a subtype of both inform and ack.]

Fig. 8.

This seems to be a rather satisfying configuration, although slightly asymmetrical. 
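
The subtype relations argued for so far can be written down as a small table of parent links. This is a minimal Python sketch that encodes only the relations stated in the text (request is an inform; reply is both an inform and an ack); any further links in the original figure are not assumed, and the names `PARENTS` and `is_a` are our own.

```python
# Parent links for the four fundamental performatives, as stated in the text.
# Only the relations the prose names are encoded; nothing else is assumed.
PARENTS = {
    "inform": [],
    "ack": [],
    "request": ["inform"],
    "reply": ["inform", "ack"],   # multiple inheritance: a reply is both
}


def is_a(kind: str, ancestor: str) -> bool:
    """Return True if `kind` equals `ancestor` or is one of its subtypes."""
    if kind == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in PARENTS[kind])
```

With this table, `is_a("reply", "ack")` and `is_a("request", "inform")` hold, while `is_a("request", "ack")` does not, which is the asymmetry the text remarks on.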



2.3 Adding Constraints 

We will overload the last diagram by adding a sequence relation to indicate that an 
Inform is followed by an Ack : 



[Diagram: the hierarchy of Fig. 8 with sequence arrows added from inform to ack and
from request to reply.]

Fig. 9.

One can observe that an Inform must be followed by an Ack, and a Request must
be followed by a Reply (and not an Ack). This does not violate the Inform/Ack
constraint since Reply is a subtype of Ack anyway. In addition, the above diagram says
that one can't Ack a Request, because an Ack is not a Reply 7 . Just what we want.
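
The way the sequence constraints interact with subtyping can be checked mechanically. The sketch below is one possible reading, not CASA's implementation: it assumes that a message inherits the follow-up rules of all its supertypes and that a follow-up is valid only if it satisfies every inherited rule. The names `FOLLOWED_BY`, `ancestors`, and `valid_followup` are hypothetical.

```python
# Subtype links stated in the text, plus the two sequence constraints.
PARENTS = {"inform": [], "ack": [], "request": ["inform"], "reply": ["inform", "ack"]}
FOLLOWED_BY = {"inform": "ack", "request": "reply"}


def ancestors(kind):
    """Return `kind` together with all of its supertypes."""
    found = {kind}
    for parent in PARENTS[kind]:
        found |= ancestors(parent)
    return found


def valid_followup(first, second):
    """True if `second` satisfies every follow-up rule that `first` inherits."""
    required = {FOLLOWED_BY[k] for k in ancestors(first) if k in FOLLOWED_BY}
    # `second` must be (a subtype of) each required follow-up type.
    return bool(required) and required <= ancestors(second)
```

Under these assumptions a request inherits "must be followed by an ack" from inform and adds "must be followed by a reply"; a reply meets both, whereas a bare ack fails the second rule.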



2.4 Composing Messages 

All this has interesting consequences in implementation. Nothing in the CASA mes- 
sage exchange changes, but it does tell us that a request protocol is a composition of 
two inform protocols (where the [overloaded] middle message is a specialized Ack): 



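[Diagram: a request protocol drawn as two overlapping inform protocols: request, then
reply, then ack, with the reply acting both as the ack of the first inform and as the
inform of the second.]

Fig. 10.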



7 One might argue that a request might be followed by both an Ack and a Request, and con- 
form to this diagram, but this cannot be because in a proper conversation thread, speech acts 
are well ordered: one message cannot be followed by more than one other message. 
This composition is legal only because we had previously observed that a Reply is
a subtype of both Ack and Inform. The fact that we can do this composition lends 
support to our earlier analysis that Request is a subtype of Inform and Reply is a sub- 
type of both Inform and Ack. 



2.5 When Things Go Wrong 

All this is fine when the world is unfolding as it should. But sometimes things don't 
go as planned and errors happen. An agent may want to negatively acknowledge 
(Nack) an inform ("I don't believe you", "I don't understand the message"). Or an
agent may be unable (or unwilling) to perform a request ("Sorry, can't do it"). Or a 
message may simply not be returned for whatever reason. 

Since these forms of failure can be regarded as being speech acts (as opposed to 
physical acts), we can simply add them as subtypes of replies (which also makes them 
subtypes of acks as well): 



[Diagram: the performative hierarchy (inform, request, ack, reply) extended with nack
and with the error types error, refuse, timeout, and not-understood.]

Fig. 11.



Thus, conversation specifications (protocols), don't need to worry about error con- 
ditions: since an error is just a specialization of a reply or an ack, it is a "normal" part 
of a regular conversation, at least at the speech-act level. Agents may and will alter 
their behaviour when errors occur, but there is no reason to unduly complicate a pro- 
tocol description if the circumstances do not warrant it. 

The errors defined in Figure 11 are:

- error, which is a general catch-all for some exceptional error that may occur.

- refuse, which is used when an agent explicitly refuses a request (or possibly to 
deny an inform, similar to the FIPA disconfirm). 

- not-understood, which is used to inform the sending agent that the receiving agent 
cannot interpret the message. 

- timeout, which is a special case explained below. 

Fig. 12. Partial ontology of messages. This diagram describes this ontology in terms of typing
(the solid arrows represent the is-a relationship), sequence (the V-arrows represent the
direction of the message, as in "from-agent -> message -> to-agent"), and grouping (the
messages in the dotted boxes all conform to the relationships impinging on the dotted box).
Thus, we may read the advertise message as being a subtype of register, and being sent by an
AbstractAgent and received by either an AbstractYellowPages agent or an AbstractLAC agent.

There are several ways to handle the case where an acknowledgement to an inform or a
reply to a request is not returned. We find it very convenient to make use of timeout
messages, which, in fact, are not ordinary messages in that they are not normally sent
between agents. Rather, agents "send" timeout messages to themselves as a simple
"bookkeeping" technique: whenever a message is sent, it includes a timeout field
which specifies by when an acknowledgement/reply is expected. If no acknowledgement/reply
is received by the time the timeout expires, the agent merely "sends" a matching
timeout message to itself. This technique greatly simplifies agent logic because agents
are always guaranteed to receive a reply (even if it may only be a timeout). In addition,
since overworked receiving agents are also aware of the timeout encoded in the
message header, they may safely ignore expired messages without further processing.
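
The bookkeeping described here can be sketched in a few lines of Python. This is an illustration under our own assumptions, not CASA code: the class `Mailbox` and its methods are hypothetical, time is a plain number supplied by the caller, and a reply that arrives after its timeout has fired is simply dropped.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    """Tracks outstanding messages and turns missed deadlines into timeouts."""
    pending: dict = field(default_factory=dict)   # reply-with id -> deadline
    inbox: list = field(default_factory=list)     # (id, performative) pairs

    def send(self, reply_with, timeout):
        """Record that an ack/reply to `reply_with` is expected by `timeout`."""
        self.pending[reply_with] = timeout

    def receive(self, in_reply_to, performative):
        """Accept an ack/reply only while its conversation is still pending."""
        if in_reply_to in self.pending:
            del self.pending[in_reply_to]
            self.inbox.append((in_reply_to, performative))

    def tick(self, now):
        """'Send' a timeout message to ourselves for every expired deadline."""
        expired = [rid for rid, deadline in self.pending.items() if now >= deadline]
        for rid in expired:
            del self.pending[rid]
            self.inbox.append((rid, "timeout"))
```

Because every `send` ends in exactly one inbox entry (a real answer or a timeout), the code that consumes the inbox never has to special-case silence.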



2.6 Acts 

Now we turn our attention to the type lattice of acts we had mentioned in Section 2.2.
We have created a separate type lattice of acts that is somewhat subservient to the
smaller type lattice of performatives. We attempted this type lattice for the
fundamental messages necessary to manage the CASA infrastructure (see Figure 12) 8 .

The figure shows remarkable regularity. There are three fundamental (but not
necessarily exclusive) types of acts: create, destroy, and get, corresponding to the three
fundamental things one can do with data: write, delete, and read. For each of the
agent types, there is a corresponding triplet of messages that represent its
functionality. For example, an agent may request of a YellowPages agent to be advertised
(advertise), to be removed from the list of advertised agents (unadvertise), or to search
for some other agent among the YellowPages agent's registered advertisers (search).
This otherwise perfect symmetry is broken only in one case: an agent may delete a
history list (deleteHistoryCD) and read the history list (getHistoryCD), but it may not
write into the history list (the history list is written to as the Cooperation Domain
records all the messages it processes).

8 The initial formulation of the ACT type lattice, of course, was not so clean. The process of
building this diagram caused us to tweak our design by renaming a few messages, combining
similar functions into single more powerful functions, and adding a few messages that had
remained missing.



<!DOCTYPE CASAmessage [
<!ELEMENT CASAmessage (version, performative, act?, sender, receiver,
                       from?, to?, timeout?, reply-with?, in-reply-to?,
                       request-ack?, language?, language-version?, ontology?,
                       ontology-version?, content?)>
<!ELEMENT version          (#PCDATA)>
<!ELEMENT performative     (#PCDATA)>
<!ELEMENT act              (#PCDATA)>
<!ELEMENT sender           (#PCDATA)>
<!ELEMENT receiver         (#PCDATA)>
<!ELEMENT from             (#PCDATA)>
<!ELEMENT to               (#PCDATA)>
<!ELEMENT timeout          (#PCDATA)>
<!ELEMENT reply-with       (#PCDATA)>
<!ELEMENT in-reply-to      (#PCDATA)>
<!ELEMENT request-ack      (#PCDATA)>
<!ELEMENT language         (#PCDATA)>
<!ELEMENT language-version (#PCDATA)>
<!ELEMENT ontology         (#PCDATA)>
<!ELEMENT ontology-version (#PCDATA)>
<!ELEMENT content          ANY>
]>




Fig. 13. A CASA message DTD (there are additional elements for thematic roles and semantic
modifiers, but these have been omitted in the interests of brevity).



2.7 A More Powerful Message 

As already mentioned, CASA exchanges messages with headers in either KQML 
format or XML format. Both of these formats contain exactly the same information, 
so it matters little (except that XML is more verbose) which is used. The XML DTD 
in Figure 13 describes our message format. 
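
To make the format concrete, the sketch below serialises a message to XML with Python's standard `xml.etree.ElementTree` module, emitting child elements in the order the DTD in Figure 13 lists them. It is an illustration only: the function `to_xml`, the constants `FIELD_ORDER` and `REQUIRED`, and the sample values (including the version string "1.0") are our own assumptions, and the code does not validate against the DTD itself.

```python
import xml.etree.ElementTree as ET

# Child elements in the order given by the CASAmessage content model.
FIELD_ORDER = [
    "version", "performative", "act", "sender", "receiver", "from", "to",
    "timeout", "reply-with", "in-reply-to", "request-ack", "language",
    "language-version", "ontology", "ontology-version", "content",
]
# Elements that carry no "?" in the content model.
REQUIRED = {"version", "performative", "sender", "receiver"}


def to_xml(fields):
    """Serialise a dict of header fields as a CASAmessage XML string."""
    missing = REQUIRED - set(fields)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    unknown = set(fields) - set(FIELD_ORDER)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    root = ET.Element("CASAmessage")
    for name in FIELD_ORDER:          # DTD order, not dict order
        if name in fields:
            ET.SubElement(root, name).text = str(fields[name])
    return ET.tostring(root, encoding="unicode")


# Hypothetical example values, chosen only for illustration.
message = to_xml({"performative": "request", "version": "1.0",
                  "sender": "Alice", "receiver": "CDagent1",
                  "act": "inviteToJoinCD"})
```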

Most of the fields in the message are either described above or are standard to 
FIPA [8]. 

Why did we extend the FIPA message header format? Largely, this is because we
are concerned with the observable properties (as opposed to the internal, private
properties) of agent interaction. We want an external observer to be able to judge the
soundness (conformance to social conventions or protocols) of a conversation without
necessarily understanding all of the subtleties and details of the conversation. That is,
an external observer should be able to "understand" a conversation even though the
observer does not understand the language and ontology used in the content section of
the message. FIPA's message format is a very good foundation for such understanding.
For example, the performative label enables an observer to superficially understand
the message type; the separate sender/receiver and to/from labels enable an observer
to understand when a message is merely being forwarded as opposed to being actually
addressed to the recipient agent; the language label enables the observer to judge
whether the observer can understand the content; and the reply-with and in-reply-to
labels enable the observer to disambiguate multiple concurrent conversational threads.
However, our experience with strict reliance on observable properties motivated us to
put more "semantic" information in the message. These include:

- We have arranged the performatives in a type lattice to enable an external ob- 
server, not conversant in the details of some specialized conversational domain, to 
understand the conversation based on being able to follow a particular foreign 
performative (or act) up the type lattice until the observer finds a performative (or 
act) that is understood (see section 2.2). 

- We have added a separate act label, motivated by our observation that the more
fundamental performatives, such as inform, request, and reply, are qualitatively
different from the lower-level "performatives" (see section 2.6). We have therefore
taken out some of FIPA's (and our own) "performatives", and placed them in a
separate type lattice under the message label act.

- We have added an ontology label, which is intended to extend the FIPA language
label, since an observer may understand a certain language (e.g. Prolog) but not
understand the ontology used (e.g. the Prolog predicate library used).

- We have also qualified the message and language and ontology labels with ver- 
sion numbers to support the evolution of these definitions. The version numbers 
allow observers (and conversation participants) to recognize when they have an 
older version of the message/language/ontology definition, and may not under- 
stand the entire contents, or when they are dealing with a conversation participant 
who has an older definition, and may be able to adjust their interpretations and ut- 
terances to match. 
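
The first extension in the list above, following a foreign performative (or act) up the type lattice until something familiar is found, amounts to a short graph search. The following Python sketch shows one way an observer might do this; the function `generalise`, the breadth-first strategy, and the sample lattice fragment are assumptions made for illustration.

```python
from collections import deque


def generalise(label, parents, understood):
    """Walk up the lattice breadth-first from `label` and return the nearest
    type the observer understands, or None if no ancestor is understood."""
    queue = deque([label])
    seen = set()
    while queue:
        kind = queue.popleft()
        if kind in understood:
            return kind
        if kind in seen:
            continue
        seen.add(kind)
        queue.extend(parents.get(kind, []))
    return None


# Hypothetical fragment of the performative lattice discussed in Section 2.
LATTICE = {
    "refuse": ["reply"],
    "reply": ["inform", "ack"],
    "request": ["inform"],
}
```

An observer that only knows `inform` and `ack` would read an unfamiliar `refuse` as an `inform`; one that also knows `reply` would stop one level earlier and lose less detail.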

All of these extensions are optional, and allow us to remain compliant with the FIPA
message format, so long as the FIPA agents are willing to ignore foreign labels in
incoming messages. Figure 14 shows an example of a CASA request/reply message pair.

These extensions allow an external observer to monitor messages and make sense
of them without having to be omniscient and gaze into the internal "thought process" of
the participating agents. While the original FIPA definition of the agent allows this to
some extent, our extensions support further semantic interpretation of messages, and
offer pragmatic solutions to some of the practical problems that occur in real-life,
evolving situations.

All of this is useful, but must be extended to the composition of messages, which is 
described in the next section. 
( :request
  :act          register.instance
  :sender       casa://kremer@192.168.1.42:8700/casa/CooperationDomain/coolness
  :receiver     casa://kremer@192.168.1.42:9000
  :timeout      1066456360438
  :reply-with   casa://kremer@192.168.1.42:8700/casa/CooperationDomain/coolness--0
  :language     casa.URLDescriptor
  :content      "casa://kremer@192.168.1.42:8700/casa/CooperationDomain/coolness true"
)

( :reply
  :act          register.instance
  :sender       casa://kremer@192.168.1.42:9000/casa/LAC/ResearchLAC
  :receiver     casa://kremer@192.168.1.42:8700/casa/CooperationDomain/coolness
  :timeout      1066456360438
  :reply-with   casa://kremer@192.168.1.42:9000/casa/LAC/ResearchLAC--1
  :in-reply-to  casa://kremer@192.168.1.42:8700/casa/CooperationDomain/coolness--0
  :language     casa.StatusURLandFile
  :content      "( 0 \"Success\"
                 casa://kremer@192.168.1.42:8700/casa/CooperationDomain/coolness#lac=9000
                 \"/casaNew/root/casa/CooperationDomain/coolness.casa\" )"
)




Fig. 14. An example of a CASA request/reply message pair using the KQML message format.
In this case, a Cooperation Domain is registering with a LAC (to let the LAC know it is
running). The LAC responds with a success status together with the Cooperation Domain's
fully-qualified URL and the file it should use to store any persistent data. (Backslashes
are used as escape characters in the content part.)



3 Conversations 

In our work with CASA, we are aiming for a high degree of flexibility with respect to the
construction of agents. We therefore choose to design in enough flexibility to allow
agents that are based on either conversation protocols or on social commitments [4].

3.1 Conversation Protocols 

When in conversation protocols mode, CASA conversations are modeled by using 
conversation protocols (CPs) to define the possible utterances and responses that can 
be exchanged between agents. Following FIPA standards for interaction protocols [8], 
we define a variety of patterns in which initiators and participants interact. This 
model is flexible in that any new interaction or conversation can be added by simply 
defining the possible messages (such as INFORM/REQUEST), and their counterparts 
(ACK/REPLY). For example, suppose a very simple interaction was needed to model 
two agents: one with the role of requesting a service, and another that may or may not 
agree to provide the service. This can be done by adding the following to each of the 
agents’ list of possible messages: 



Agent 1                     Agent 2

performative: Request       performative: Reply
act: service X              act: service X

                            performative: Reject
                            act: service X

In addition, the agents' library of reactive protocols is updated (which is consulted
when a CP agent receives a message "out of the blue" starting a new conversation),
and a pattern is written to handle the particular protocol. In this fashion, more
complex interactions can be added.

The internal mechanisms, which determine how an agent will decide what (if any) 
message to send, are implementation dependent. In fact, we leave such details out
since we wish to concentrate on the way in which agents communicate and on ob- 
servable agent behaviour. We leave the matter of how agents should behave and their 
internal workings to the individual agent developer. The key is that agents involved in 
the interaction require the protocol to be pre-defined. 



3.2 Social Commitments 

The social commitment (SC) model is an alternative to the CP model described 
above, and provides a mechanism by which agents can negotiate obligations among 
one another. A social commitment is an obligation that is negotiated between poten- 
tial debtor and creditor agents (see [5], [4], [11], [6], [2] & [14]). Such negotiation of 
obligations mimics the behaviour of agents using CPs, but in a more flexible manner.
For example, suppose an agent (agent A) were to issue a proposal to another agent 
(agent B). Figure 15 shows the sequence of events and the possibilities agent B can 
reply with. If the debtor agent (agent B) accepts the proposal by responding with an 
accept reply to agent A, it adopts the obligation to provide the creditor agent (agent 
A) with some service. It should also be noted that the commitment is not imposed on 
the debtor agent; it is negotiated. A more complex example might allow for agent B 
to refuse the proposal and make a counterproposal which would, in turn, be evaluated 
by agent A. 



[Diagram: agent A sends PROPOSE to agent B; agent B may answer with ACCEPT, REJECT,
or COUNTER.]

Fig. 15. A simple proposal.

The possible replies that might be issued by the receiving agent include acceptance,
rejection, or counterproposal, and the sending agent may at any time issue a
withdrawal of its own proposal.
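
The exchange just described can be sketched as a small state holder that yields a commitment only when a proposal is accepted. This is a hedged Python illustration, not the formal SC model: the classes `Commitment` and `Negotiation` are hypothetical, and the treatment of a counterproposal (the roles of proposer and responder swap, so the agent that accepts becomes the debtor) is our own assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    debtor: str     # agent that adopts the obligation
    creditor: str   # agent to whom the service is owed
    service: str


class Negotiation:
    """One proposal and the moves that may follow it."""

    def __init__(self, proposer, responder, service):
        self.proposer, self.responder, self.service = proposer, responder, service
        self.open = True
        self.commitment = None

    def handle(self, speaker, move, service=None):
        """Apply a move and return the resulting commitment (or None)."""
        if not self.open:
            raise ValueError("negotiation is closed")
        if move == "withdraw" and speaker == self.proposer:
            self.open = False                 # proposer may withdraw at any time
        elif speaker != self.responder:
            raise ValueError(f"{speaker} may not {move} this proposal")
        elif move == "accept":
            # The commitment is negotiated, not imposed: it exists only now.
            self.commitment = Commitment(self.responder, self.proposer, self.service)
            self.open = False
        elif move == "reject":
            self.open = False
        elif move == "counter":
            # Assumption: a counterproposal swaps roles and replaces the service.
            self.proposer, self.responder = self.responder, self.proposer
            self.service = service
        else:
            raise ValueError(f"unknown move: {move}")
        return self.commitment
```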

The value of using the social commitments model is twofold. First, it allows for 
linking the external world (including interactions with other agents) with the internal 
agent mechanisms. An agent maintains a list of the current and past obligations both 
owed to other agents and to itself. This information could be used to “rationally”
influence the agent’s future actions. Thus, a conversation is never rigidly “scripted” 
as in conversation protocols, but flows naturally from an agent disposing its current 
obligations. Second, it provides a degree of autonomy (through the negotiation proc- 
ess) for adopting commitments. An agent is free to accept or reject proposals as de- 
scribed above. 



3.3 Observer Agent 

One aspect that enhances the flexibility of the CASA model is that agents can be used 
in various capacities other than being directly involved in conversations. A suffi- 
ciently privileged agent may join a cooperation domain and observe the communica- 
tion between other agents. This agent, which may simply be a “plug-in agent” for the 
cooperation domain, could perform various functions. It could perform the role of a 
conversation manager, tracking and organizing the messages sent between agents; it 
could perform the role of janitor, cleaning up obsolete conversations and connections; 
or it could perform the role of a translator, resending messages broadcast in one lan- 
guage in another language. An agent may also observe conversations between other 
agents in order to gather information about how it can best proceed toward its goals. 
Note that it is owing to the arrangement of the performatives and acts into a type
lattice that there is sufficient information in the message “envelope” to allow an “out- 
side” agent to “understand” a conversation even if it does not understand the domain- 
specific performatives, acts, content language, and ontology of the participating 
agents. 

In the context of the social commitments model, it might be useful to have a judg- 
ing or policing agent observing interactions between agents, checking on their con- 
formance to social norms, such as whether or not agents discharge their obligations 
within an acceptable period of time (or at all). This is only possible because the SC 
model relies only on observable behaviour.

The above examples are meant to illustrate the flexibility of CASA, in that it can 
be extended in many ways other than adding additional conversation protocols. It 
supports future developments in the areas of rationality and social knowledge as they 
relate to conversations and interactions among agents in a society. 



4 Future Work 

Although the work in CASA and social commitment theory is progressing well, there 
is still much to do. We have several projects underway in various related research 
fields. We are working on further extending the CASA message “envelope” to include
more formal reference objects and relationships based on theories from linguistics
(e.g. [12], [13]). For example, we can add fields such as agent (initiator of an
act), patient (the entity undergoing the effects of an act), theme (the entity that is
changed by the act), experiencer (the entity that is aware of, but not in control of, the
act), and beneficiary (the entity for whose benefit the act is performed).
The detailed semantics of messages, as described here, has not yet been worked
out. Some of our future work involves the formal specification of message semantics, 
at least as it relates to the services offered within CASA. Some formal specification of 
the social commitment model has been done (see [4] & [5]), but the formal specifica- 
tion of CASA services has hardly been touched so far. 

Furthermore, there are several pragmatic areas that need to be worked out if CASA,
or a CASA-like system, is to be used in real settings (such as the manufacturing
systems we are working on). For example, security will be an issue, and we are currently
undertaking research to determine an appropriate approach to protecting stored data, 
messages themselves, and cooperation membership in such a dynamic and distributed 
environment. 

5 Conclusions 

In this paper, we have described some of our design decisions in the CASA infra- 
structure primarily with respect to the structure of inter-agent messages. Most of the 
discussion contrasts our structure with that of the FIPA model. We differ from FIPA's 
viewpoint primarily in that we concern ourselves more with the observable behaviour
of agents, and the ability of an outside observer to understand the meaning of a
conversation. Minimally, we would like the outside observer to be able to deduce if the
agents are participating in a logical conversation and conforming to a set of social
norms.

To this end, we have conformed to the FIPA message structure with a few changes
(primarily ordering the performatives in a type lattice), and have extended it in
several ways. We have extended the message structure by adding fields such as act and
ontology. In addition, we have added version numbers as attributes to the message
language itself, and to the language and ontology fields.

We also briefly discussed conversations, which are composed of messages, and 
discussed CASA's support for agents based on either conversational protocols or 
social commitments. 
Abstract. The problem of checking that agents correctly implement the
semantics of an agent communication language has become increasingly 
important as agent technology makes its transition from the research lab- 
oratory to field-tested applications. In this paper, we show how model 
checking techniques can be applied to this problem. Model checking is a 
technique developed within the formal methods community for automat- 
ically verifying that finite-state concurrent systems implement temporal 
logic specifications. We first describe a variation of the MABLE
multiagent BDI programming language, which permits the semantics (pre-
and post-conditions) of ACL performatives to be defined separately from
a system where these semantics are used. We then show how assertions 
defining compliance to the semantics of an ACL can be captured as claims 
about MABLE agents, expressed using MABLE’s associated assertion lan- 
guage. In this way, compliance to ACL semantics reduces to a conventional 
model checking problem. We illustrate our approach with a number of 
short case studies. 



1 Introduction 

The problem of checking that agents correctly implement the semantics of an 
agent communication language has become increasingly important as agent
technology makes its transition from the research laboratory to field-tested
applications. In this paper, we show how model checking techniques can be applied to
this problem, by making use of our MABLE language for the automatic verification
of multiagent systems [19]. 

Model checking is a technique that was developed within the formal methods 
community for automatically verifying that finite-state systems implement tem- 
poral logic specifications [1]. The name “model checking” arises from the fact 
that verification can be viewed as a process of checking that the system is a 
model that validates the specification. The principle underpinning the approach 
is that the possible computations of a given system S can be understood as a
directed graph, in which the nodes of the graph correspond to possible states of S,
and arcs in the graph correspond to state transitions. Such directed graphs are 
essentially Kripke structures — the models used to give a semantics to temporal 
logics. Crudely, the model checking verification process can then be understood
as follows: Given a system S, which we wish to verify satisfies some property $\varphi$
expressed in a temporal logic L, generate the Kripke structure $M_S$ corresponding
to S, and then check whether $M_S \models_L \varphi$, i.e., whether $\varphi$ is valid
in the Kripke structure $M_S$. If the answer is “yes”, then the system satisfies the
specification; otherwise it does not.
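
As a toy illustration of this idea, the Python sketch below checks the simplest kind of temporal property, an invariant ("the proposition holds in every reachable state"), by exploring the state graph breadth-first. It is deliberately much weaker than a real model checker such as SPIN, which handles full LTL; the function name `check_invariant` and the example structure are our own assumptions.

```python
from collections import deque


def check_invariant(initial, transitions, labels, prop):
    """Explore the Kripke structure from `initial`. Return None if `prop`
    labels every reachable state, else a path to a violating state."""
    parent = {initial: None}          # also serves as the visited set
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if prop not in labels.get(state, set()):
            path = []                 # rebuild the counterexample path
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for successor in transitions.get(state, ()):
            if successor not in parent:
                parent[successor] = state
                queue.append(successor)
    return None


# Hypothetical three-state system: s2 is reachable and is not labelled "safe".
TRANSITIONS = {"s0": ["s1"], "s1": ["s0", "s2"], "s2": []}
LABELS = {"s0": {"safe"}, "s1": {"safe"}, "s2": set()}
```

A "yes" answer corresponds to `None`; otherwise the returned path plays the role of the counterexample trace that model checkers typically report.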

Our approach to model checking for the ACL compliance problem is as fol- 
lows. In a previous paper [19], we described our first implementation of the 
MABLE language. MABLE is a language intended for the design and automatic 
verification of multi-agent systems; it is essentially an imperative programming 
language, augmented by some features from Shoham’s agent-oriented program- 
ming paradigm [15]: in particular, agents in MABLE possess data structures cor- 
responding to their beliefs, desires, and intentions [18]. A MABLE system may 
be augmented by a number of claims about the system, expressed in a simplified
form of the LORA language given in [18]. Our MABLE compiler translates
the MABLE system into processes in PROMELA, the input language for the SPIN
model checking system [7-9]; claims are translated into SPIN-format LTL formulae.
The SPIN system can then be directly invoked to determine whether or not
the original system satisfies the original claim.

In this paper, we show how MABLE has been extended in two ways to support 
ACL compliance testing. First, we have added a feature to allow programmers to 
define the semantics of ACL performatives separately from a program that makes 
use of these performatives, thus making it possible for the same program to ex- 
hibit different behaviours with different semantics. Second, we have extended 
the MABLE claims language to support a dynamic logic-style “happens” construct:
we can thus write a claim that expresses that (for example) whenever
an agent performs action $\alpha$, property $\varphi$ eventually follows. By combining
these two features, we can automatically verify whether or not an agent respects ACL
semantics.
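
The shape of such a claim ("whenever the action happens, the property eventually holds") can be illustrated on a single finite trace. The Python sketch below is only an approximation made for illustration: a model checker verifies the claim over all runs of the system, whereas this function inspects one recorded run, and it assumes that the property holding in the same step as the action counts as "eventually". The name `responds` and the trace encoding are hypothetical.

```python
def responds(trace, action, prop):
    """Check, on one finite trace, that every occurrence of `action` is
    followed (in the same or a later step) by a step where `prop` holds.

    `trace` is a list of (actions, props) pairs, each a set of names."""
    pending = False
    for actions, props in trace:
        if action in actions:
            pending = True        # an obligation is now outstanding
        if prop in props:
            pending = False       # ...and is discharged once prop holds
    return not pending
```

For example, a trace in which an agent performs `inform` and a later step is labelled `believes` satisfies the check, while a trace that ends with the `inform` still unanswered does not.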

The remainder of the paper is structured as follows. We begin with an 
overview of the ACL compliance checking problem. We then describe a variation
of the MABLE multiagent BDI programming language, which permits the
semantics (pre- and post-conditions) of ACL performatives to be defined sepa- 
rately from a system where these semantics are used. We then show how asser- 
tions defining compliance to the semantics of an ACL can be captured as claims 
about MABLE agents, expressed using MABLE’s associated assertion language. 
In this way, compliance to ACL semantics reduces to a conventional model check- 
ing problem. We illustrate our approach with a number of short case studies, 
and conclude with a discussion of future work. 



2 ACL Compliance Verification 

One of the main reasons why multi-agent systems are currently a major area 
of research and development activity is that they are seen as a key enabling 
technology for the Internet-wide electronic commerce systems that are widely 
predicted to emerge in the near future [5]. If this vision of large-scale, open
multi-agent systems is to be realised, then the fundamental problem of inter- 
operability must be addressed. It must be possible for agents built by different 
organisations using different hardware and software platforms to safely communicate
with one another via a common language with a universally agreed semantics.
The inter-operability requirement has led to the development of several
standardised agent communication languages (ACLs) [11,4]. The development of
these languages has had significant input from industry, and particularly from 
European telecommunications companies. 

However, in order to gain acceptance, particularly for sensitive applications 
such as electronic commerce, it must be possible to determine whether or not 
any system that claims to conform to an ACL standard actually does so. We say 
that an ACL standard is verifiable if it enjoys this property. FIPA, currently
the main standardisation body for agent communication languages, recognise
that “demonstrating in an unambiguous way that a given agent implementation
is correct with respect to [the semantics] is not a problem which has been
solved” [4], and identify it as an area of future work. (Checking that an implementation
respects the syntax of an ACL such as that proposed by FIPA is, of
course, trivial.) If an agent communication language such as FIPA’s is ever to
be widely used, particularly for such sensitive applications as electronic commerce,
then such compliance testing is important. However, the problem of
compliance testing (verification) is not actually given a concrete definition by
FIPA, and no indication is given of how it might be done.

In [17], the verification problem for agent communication languages was formally
defined for the first time. It was shown that verifying compliance to some
agent communication language reduces to a verification problem in exactly the
sense that the term is used in theoretical computer science. To see what is meant
by this, consider the semantics of FIPA’s inform performative [4, p25]:

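\[
\begin{array}{l}
\langle i, \mathit{inform}(j, \varphi) \rangle \\
\quad \mathrm{FP}:\ B_i \varphi \wedge \neg B_i (\mathit{Bif}_j \varphi \vee U_j \varphi) \\
\quad \mathrm{RE}:\ B_j \varphi
\end{array}
\tag{1}
\]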

Here $\langle i, \mathit{inform}(j, \varphi) \rangle$ is a FIPA message: the message type (performative) is
inform, the content of the message is $\varphi$, and the message is being sent from $i$
to $j$. The intuition is that agent $i$ is attempting to convince (inform) agent $j$ of
the truth of $\varphi$. The FP and RE define the semantics of the message: FP is the
feasibility pre-condition, which states the conditions that must hold in order for
the sender of the message to be considered as sincere; RE is the rational effect
of the message, which defines what a sender of the message is attempting to
achieve. The $B_i$ is a modal logic connective for referring to the beliefs of agents
(see e.g., [6]); $\mathit{Bif}$ is a modal logic connective that allows us to express whether
an agent has a definite opinion one way or the other about the truth or falsity
of its parameter; and $U$ is a modal connective that allows us to represent the
fact that an agent is “uncertain” about its parameter. Thus an agent $i$ sending
an inform message with content $\varphi$ to agent $j$ will be respecting the semantics
of the FIPA ACL if it believes $\varphi$, and it is not the case that it believes of $j$ either
that $j$ believes whether $\varphi$ is true or false, or that $j$ is uncertain of the truth or
falsity of $\varphi$.

It was noted in [17] that the FP acts in effect as a specification or contract 
that the sender of the message must satisfy if it is to be considered as respecting 
the semantics of the message: an agent respects the semantics of the ACL if, when 
it sends the message, it satisfies the specification. Although this idea has been 
understood in principle for some time, no serious attempts have been made until 
now to adopt this idea for ACL compliance testing. 

We note that a number of other approaches to ACL compliance testing have 
been proposed in the literature. Although it is not the purpose of this paper 
to contribute to this debate, we mention some of the key alternatives. Pitt and 
Mamdani defined a protocol-based semantics for ACLs [12]: the idea is that the 
semantics of an ACL are defined in terms of the way that they may be used in the 
context of larger structures, i.e., protocols. Singh championed the idea of social 
semantics: the idea that an ACL semantics should be understood in terms of the 
observable, verifiable changes in social state (the relationships between agents) 
that using a performative causes [16]. 

3 MABLE 

MABLE is a language intended for the design and automatic verification of multi- 
agent systems. The language was introduced in [19]; here, we give a high-level 
summary of the language, and focus in detail on features new to the language 
since [19]. 

Agents in MABLE are programmed using what is essentially a conventional 
imperative programming language, enriched with some features from agent-oriented
programming languages such as AGENT0 [15], GOLOG [10], and AgentSpeak
[13]. Thus, although the control structures (iteration, sequence, and selection)
resemble (and indeed are closely modelled on) those found in languages
such as C, agents in MABLE have a mental state, consisting of data structures 
that represent the agent’s beliefs, desires, and intentions (cf. [18]). The semantics 
of MABLE program constructs are defined with respect to the mental states of 
the agents that perform these statements. For example, when an agent executes 
an assignment operation such as 

x = 5 

then we can characterise the semantics of this operation by saying that it causes 
the agent executing the instruction to subsequently believe that the value of x 
is 5. 
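
The idea that an assignment is characterised by its effect on the executing agent's beliefs can be sketched in a few lines of Python. The Agent class and its method names are hypothetical and only mirror the informal description above; MABLE's actual representation of mental state is not shown in this paper.

```python
class Agent:
    """Toy agent whose mental state is a dictionary of believed variable values."""

    def __init__(self, name):
        self.name = name
        self.beliefs = {}

    def assign(self, var, value):
        # Executing `var = value` makes the agent subsequently believe var == value.
        self.beliefs[var] = value

    def believes_equal(self, var, value):
        # True only if the agent currently believes that var has this value.
        return self.beliefs.get(var) == value


agent = Agent("agent1")
agent.assign("x", 5)
```

After the assignment, agent.believes_equal("x", 5) holds, while nothing is believed about variables that were never assigned.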

In addition, MABLE systems may be augmented by the addition of formal 
claims made about the system. Claims are expressed using a (simplified) version
of the belief-desire-intention logic LORA [18], known as MORA [19]; we
describe this language in more detail below.

The MABLE language has been fully implemented. The implementation 
makes use of the SPIN system [7-9], a freely available model-checking system 
for finite state systems. Developed at AT&T Bell Labs, SPIN has been used to
formally verify the correctness of a wide range of finite state distributed and 
concurrent systems, from protocols for train signalling to autonomous space- 
craft control systems [8]. SPIN allows claims about a system to be expressed 
in propositional Linear Temporal Logic (LTL): SPIN is capable of automatically
checking whether such claims are true or false.

The MABLE compiler takes as input a MABLE system and associated claims 
(in MORA) about this system (see Figure 1). MABLE generates as output a
description of the MABLE system in PROMELA, the system description language 
for finite-state systems used by the SPIN model checker, and a translation of the 
claims into the LTL form used by SPIN for model checking. SPIN can then be 
used to automatically verify the truth (or otherwise) of the claims, and simulate 
the execution of the MABLE system, using the PROMELA interpreter provided as 
part of SPIN. 

Communication in MABLE. In the version of MABLE described in [19], com- 
munication was restricted to inform and request performatives, the semantics 
of which were modelled on the corresponding FIPA performatives. However, this 
communication scheme rapidly proved to be too limiting, and has been signifi- 
cantly extended in the current version of MABLE. In particular, a user may use 
any kind of performative required: MABLE provides generic send and receive 
program instructions. 

The structure of MABLE’s message sending statement is as follows: 

send(CA j of phi)

where CA is a communicative act (i.e., a performative name), j is the intended
recipient of the message, and phi is the content. The sender of this message is not
represented here, but is the agent executing the statement. The basic meaning
of the statement is that a message is sent to agent j using the communicative
act CA: the content of the message is phi. (The keyword of is syntactic sugar
only; it can be replaced by any identifier, and has no effect on the semantics of
the program.)

Here is a concrete example of a MABLE send statement. 

send(inform agent2 of (a == 10)) 

This means that the sender informs agent2 that a == 10. For the moment, we 
will postpone the issue of the semantics of this statement; as we shall see below, 
it is possible for a programmer to define their own semantics, separately from 
the program itself. 

The structure of the receive statement is as follows. 

receive(CA i of phi)

As might be guessed, this means that the receiver receives a message from i for
which the communicative act is CA and the message content is phi. Communication
is synchronous in the current version of MABLE, and so for this statement
to succeed there must be a corresponding send by agent i.
Fig. 1. Operation of MABLE

A key component of the current instantiation of MABLE is that programmers
can define their own semantics for communicative acts, separately from a pro- 
gram. Thus it is possible to explore the behaviour of the same program with a 
range of different semantics, and thus to investigate the implications of different 
semantics. 

The basic model we use for defining semantics is a STRIPS-style pre-/post-condition
formalism, in the way pioneered for the semantics of speech acts by Cohen
and Perrault [2], and subsequently applied to the semantics of the KQML [3]
and FIPA [4] languages. Thus, to give a semantics to performatives in MABLE, a
user must define for every such communicative act a pre-condition and a post-condition.
Formally, the semantics for a communicative act CA are defined as
a pair $(CA_{pre}, CA_{post})$, where $CA_{pre}$ is a condition (a MABLE predicate), and
$CA_{post}$ is an assertion. The basic idea is that, when an agent executes a send
statement with performative CA, this message will not be sent until $CA_{pre}$ is
true. When an agent executes a receive statement with performative CA, then
when the message is received, the assertion $CA_{post}$ will be made true.

The MABLE compiler looks for performative semantic definitions in a file
that is by convention named mable.sem. A mable.sem file contains a number
of performative definitions, where each performative definition has the following
structure:

i: CA(j, phi) 
pre-condition 
post-condition 

where i, j and phi are the sender, recipient, and content of the message respec- 
tively, and CA is the name of the performative. The following lines define the 
pre-condition and post-condition associated with the communicative act CA. 
It is worth commenting on how these semantics are dealt with by the MABLE 
compiler when it generates PROMELA code. 

With respect to the pre-condition, the above performative definition is trans- 
lated into a PROMELA guarded command with the following structure. 

pre-condition -> send the message 

The “->” is PROMELA’s guarded command structure: to the left of -> is a condi- 
tion, and to the right is a program statement (an action). The semantics of the 
construct are that the process executing this statement will suspend (in effect, 
go to sleep) until the condition on the left hand side is true. When (more ac- 
curately, if) the condition becomes true, then the right hand side is “enabled”: 
that is, it is ready to be executed, and assuming a fair process scheduler, will 
indeed be executed. 

Notice that it is possible to define the pre-condition of a performative simply 
as “1”, i.e., a logical constant for truth, which is always true; in this case, the
send message part of the performative will always be enabled. 

With respect to the post-condition, MABLE translates receive messages into 
PROMELA code with the following structure: 
receive message;
make post-condition true

Thus once a message is received, the post-condition will be made true. Notice 
that post-conditions in a mable.sem file do not correspond to the “rational 
effect” parts of messages in FIPA semantics [4]; we elaborate on the distinction 
below. 
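
The way a pre-condition guards a send and a post-condition is applied on receipt can be sketched in Python as follows. This is only an illustration of the guarded-send idea, not the PROMELA code that MABLE generates: the function names try_send and receive, the representation of beliefs as sets, and the use of None to stand for a blocked sender are all assumptions.

```python
def try_send(sender_beliefs, pre, content):
    """Return the message if the guard `pre` holds, else None (sender stays blocked)."""
    if pre(sender_beliefs, content):
        return content
    return None


def receive(receiver_beliefs, post, sender, receiver, content):
    """Apply the post-condition: add the asserted fact to the receiver's beliefs."""
    receiver_beliefs.add(post(sender, receiver, content))


# Two alternative pre-conditions for a performative.
always = lambda beliefs, phi: True             # the constant 1: never blocks
sincere = lambda beliefs, phi: phi in beliefs  # sender must believe the content

# A post-condition of the form (believe j (intend i (believe j phi))).
# The outer belief is implicit: the fact is stored in j's own belief set.
post = lambda i, j, phi: ("intend", i, ("believe", j, phi))

a1, a2 = set(), set()
blocked = try_send(a1, sincere, "a == 10")  # a1 does not believe the content
sent = try_send(a1, always, "a == 10")      # guard 1 is always enabled
receive(a2, post, "agent1", "agent2", sent)
```

Swapping the pre-condition changes which sends are enabled without touching the agent program itself, which is the separation the semantics file is meant to provide.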

Here is a concrete example of a mable.sem performative semantic definition:

i: inform(j, phi)
1
(believe j (intend i (believe j phi)))

This says that the sender of a message will always send an inform message 
directly; it will not wait to check whether any condition is true. It also says that 
when an agent receives an inform message, it will subsequently believe that the 
sender intends that the receiver believes the content. 
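
As a rough illustration of the three-line structure of a performative definition (header, pre-condition, post-condition), here is a small Python sketch that reads such definitions into a dictionary. It assumes that blank lines are insignificant and that a file holds nothing but complete three-line definitions; the real mable.sem format may allow more than this, and the names HEADER and parse_sem are hypothetical.

```python
import re

# Header line of a definition, e.g. "i: inform(j, phi)".
HEADER = re.compile(r"^(\w+)\s*:\s*(\w+)\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)$")


def parse_sem(text):
    """Parse three-line definitions: header, pre-condition, post-condition."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("each definition needs a header, a pre- and a post-condition")
    definitions = {}
    for k in range(0, len(lines), 3):
        match = HEADER.match(lines[k])
        if match is None:
            raise ValueError("bad header: " + lines[k])
        sender, name, recipient, content = match.groups()
        definitions[name] = {
            "sender": sender,
            "recipient": recipient,
            "content": content,
            "pre": lines[k + 1],   # kept as uninterpreted text
            "post": lines[k + 2],  # kept as uninterpreted text
        }
    return definitions


example = """
i: inform(j, phi)
1
(believe j (intend i (believe j phi)))
"""
parsed = parse_sem(example)
```

The pre- and post-conditions are deliberately left as strings here; interpreting them would require a parser for MABLE conditions and assertions.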

Several examples of pre-conditions and post-conditions are given in Section 4. 
The use of semantics during the translation process is shown in Figure 2. 




Fig. 2. Operation of the MABLE system with the semantics file 
formula ::=
    forall IDEN ":" domain formula    /* universal quantification */
  | exists IDEN ":" domain formula    /* existential quantification */
  | any primitive MABLE condition     /* primitive conditions */
  | ( formula )                       /* parentheses */
  | (happens Ag stmt)                 /* statement is executed by agent */
  | (believe Ag formula)              /* agent believes formula */
  | (desire Ag formula)               /* agent desires formula */
  | (intend Ag formula)               /* agent intends formula */
  | [] formula                        /* always in the future */
  | <> formula                        /* sometime in the future */
  | formula U formula                 /* until */
  | ! formula                         /* negation */
  | formula && formula                /* conjunction */
  | formula || formula                /* disjunction */
  | formula -> formula                /* implication */

domain ::=
    agent                             /* set of all agents */
  | NUMERIC .. NUMERIC                /* number range */
  | { IDEN, ..., IDEN }               /* a set of names */

Fig. 3. The syntax of MORA claims



In summary, by disconnecting the semantics of a communicative act from a 
program that carries out such an act, we can experiment to see the effect that 
different kinds of semantics can have on the same agent. In the following section, 
we will see how this may be done in practice. 



Claims. A key component of MABLE is that programs may be interspersed 
with claims about the behaviour of agents, expressed in MORA, a subset of
the LORA language introduced in [18]. These claims can then be automatically
checked, by making use of the underlying SPIN model checker. If the claim is
disproved, then a counterexample is provided, illustrating why the claim is
false. 

A claim is introduced outside the scope of an agent, with the keyword claim
followed by a MORA formula, and terminated by a semi-colon. The formal
syntax of MORA claims is given in Figure 3. The language of claims is thus
that of quantified linear temporal BDI logic, with the dynamic logic style happens
operator, similar in intent and role to that in LORA [18]. Quantification is
only allowed over finite domains, and in particular, over: agents (e.g., “every
agent believes $\varphi$”); finite sets of objects (e.g., enumeration types); and integer
number ranges. We will here give an overview of the main constructs of the claim
language, focussing on those that are new since [19].
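
To give a feel for the temporal and quantified parts of the claim language, the following Python sketch evaluates "always", "sometime", implication, and quantification over a finite domain on a finite trace of states. It is only an approximation made for illustration: SPIN checks LTL over the (possibly infinite) runs of a model, whereas this sketch looks at a single finite list of states, and all function names are hypothetical.

```python
def always(pred):
    """[] pred: pred holds at every position from i onwards."""
    return lambda trace, i: all(pred(trace, k) for k in range(i, len(trace)))


def eventually(pred):
    """<> pred: pred holds at some position from i onwards."""
    return lambda trace, i: any(pred(trace, k) for k in range(i, len(trace)))


def implies(p, q):
    """p -> q, evaluated at the current position."""
    return lambda trace, i: (not p(trace, i)) or q(trace, i)


def atom(name):
    """A state is modelled as the set of atomic propositions true in it."""
    return lambda trace, i: name in trace[i]


def forall(domain, make_formula):
    """Quantification over a finite domain expands into a conjunction."""
    formulas = [make_formula(d) for d in domain]
    return lambda trace, i: all(f(trace, i) for f in formulas)


# forall i : {a1, a3}  [] (intend_i -> <> believe_a2)
claim = forall(["a1", "a3"], lambda ag: always(
    implies(atom("intend_" + ag), eventually(atom("believe_a2")))))

good = [{"intend_a1"}, set(), {"believe_a2"}]
bad = [{"believe_a2"}, {"intend_a3"}, set()]
```

Because the domain is finite, the quantifier can be unfolded mechanically, which is why restricting quantification to finite domains keeps claims within reach of a propositional LTL model checker.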

The goal of MORA (and also in fact of the whole MABLE framework) is
that we should be able to verify whether programs satisfy properties of the kind
expressed in BDI logics [14, 18]. To illustrate how MORA claims work, we here
give some informal examples.
Consider the following LORA formula, which says that if agent $a_1$ believes
the reactor failed, then $a_1$ intends that $a_2$ believes the reactor failed
(i.e., $a_1$ wants to communicate this to $a_2$).

\[ (\mathsf{Bel}\ a_1\ \mathit{reactorFailed}) \Rightarrow (\mathsf{Int}\ a_1\ (\mathsf{Bel}\ a_2\ \mathit{reactorFailed})) \]

We can translate such a formula more or less directly into a MORA claim,
suitable for use by MABLE. Consider the following:

claim []
  ((believe a1 reactorFailed) ->
   (intend a1 (believe a2 reactorFailed)));

The only noticeable difference is that the “whenever” reading, which is the
intended interpretation of the LORA formula, must be made explicit in the claim
with the use of the temporal [] (“always”) connective. The following LORA
formula says that if some agent wants agent $a_2$ to believe that the reactor has
failed, then eventually, $a_2$ will believe it has failed.

\[ \forall i \cdot (\mathsf{Int}\ i\ (\mathsf{Bel}\ a_2\ \mathit{reactorFailed})) \Rightarrow \Diamond (\mathsf{Bel}\ a_2\ \mathit{reactorFailed}) \]

This translates directly into the following MORA claim:

claim
  forall i : agent
    [] ((intend i (believe a2 reactorFailed))
        -> <> (believe a2 reactorFailed));

Thus far, the examples we have given illustrate features that were present in the
version of MABLE documented in [19]; we now describe the main new feature of
MORA claims, modelled on LORA’s Happens construct [18, p.62]. In LORA,
there is a path expression of the form

\[ (\mathsf{Happens}\ i\ \alpha) \]

which intuitively means “the next thing that happens is that agent $i$ does $\alpha$”.
Thus, for example, the following LORA formula says that if agent $a_1$ performs
the action of flicking the switch, then the reactor is eventually hot.

\[ (\mathsf{Happens}\ a_1\ \mathit{flickSwitch}) \Rightarrow \Diamond\, \mathit{reactorHot} \]

The current version of MABLE provides such a facility. We have a MORA
construct

(happens ag stmt)

where ag is the name of an agent and stmt is a MABLE program statement. 
This predicate will be true in a state whenever the next statement enabled for 
execution by agent ag is stmt. Consider the following concrete example. 

claim
  [] ((happens a1 x = 10;)
      -> <> (believe a1 x == 10));
This claim says that, whenever the next statement to be enabled for execution
by agent a1 is the assignment x = 10; (notice that the semi-colon is part of the
program statement, and must therefore be included in the happens construct),
then eventually, a1 believes that variable x has the value 10. (A single equals sign
in MABLE is an assignment, while a double equals sign is the equality predicate.) 
As we will see below, the happens construct plays a key role in our approach to 
ACL compliance verification. 

Before leaving this section, we note how the happens construct is implemented
by the MABLE compiler. The idea is to annotate the model that MABLE
generates, with new propositions that will be set to be true in a given state 
whenever the corresponding agent is about to execute the corresponding action. 
To do this, the MABLE compiler passes over the parse tree of the MABLE pro- 
gram, looking for program statements matching those that occur in happens 
claims. Whenever it finds one, it inserts a program instruction setting the cor- 
responding new proposition to true; when the program statement is executed, 
the proposition is set to false. The toggling of the proposition value is wrapped 
within PROMELA atomic constructs, to ensure that the toggling process itself 
does not alter the control flow of the generated system. 

Although this process increases the size of the generated model, it does so 
only linearly with the number of happens constructs, and does not appear to 
affect performance significantly. Similarly, the pre-processing time required to 
insert new propositions into the model is polynomial in the size of the model 
and the number of happens claims. 
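
The instrumentation pass described above can be pictured with the following Python sketch, which walks over a flat list of statements and wraps each statement mentioned in a happens claim with instructions that set and then clear a fresh proposition. It is a simplification under stated assumptions: a real pass would traverse a parse tree rather than a list of strings, the atomic wrapping is omitted, and the names instrument and happens_a1_0 are hypothetical.

```python
def instrument(program, watched):
    """Insert proposition toggles around every statement named in a happens claim.

    program: list of statement strings for one agent.
    watched: dict mapping a statement string to the name of its proposition.
    """
    out = []
    for stmt in program:
        if stmt in watched:
            out.append(("set", watched[stmt], True))    # agent is about to execute stmt
            out.append(("exec", stmt))
            out.append(("set", watched[stmt], False))   # stmt has now been executed
        else:
            out.append(("exec", stmt))
    return out


program = ["y = 1;", "x = 10;", "print(x);"]
instrumented = instrument(program, {"x = 10;": "happens_a1_0"})
```

Each watched statement contributes exactly two extra instructions in this sketch, which matches the observation that the added model size grows linearly.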



4 Verifying ACL Compliance 

We now demonstrate how MABLE can be used to verify compliance with ACL 
semantics. We begin with a running example that we will use in the following 
sections. The MABLE code is given in Figure 4. In this example, two agents each
hold several beliefs, and each sends the other a message containing one of
these beliefs. The message to be sent is selected non-deterministically
through the choose statement. The beliefs are inserted into the agents’ mental
states through the assert statements. Beliefs correspond to conditions on
values and differ from one agent to the other. After sending messages, agents
wait for a message from the other agent. (Due to space restrictions, we do not 
give the PROMELA code that is generated by these examples.) 



Verifying Pre-conditions. Verifying pre-conditions means verifying that 
agents satisfy the pre-condition part of an ACL performative’s semantics when- 
ever they send the corresponding message. We will focus in this paper only on 
the inform performative; the cases for request and the like are similar. 

Two approaches are possible for the pre-conditions: either agents are sincere 
(they only ever send an inform message if they believe its content), or else they 
are not (in which case they can send a message without checking to see whether 
int selection_agent1;
int selection_agent2;

agent agent1 {
  int inform_agent2;
  inform_agent2 = 0;
  selection_agent1 = 0;
  assert((believe agent1 (a == 10)));
  assert((believe agent1 (b == 2)));
  assert((believe agent1 (c == 5)));
  choose(selection_agent1, 1, 2, 3);
  if (selection_agent1 == 1) {
    print("agent1 -> a = 10\n");
    send(inform agent2 of (a == 10));
  }
  if (selection_agent1 == 2) {
    print("agent1 -> b = 2\n");
    send(inform agent2 of (b == 2));
  }
  if (selection_agent1 == 3) {
    print("agent1 -> c = 5\n");
    send(inform agent2 of (c == 5));
  }
  receive(inform agent2 of inform_agent2);
  print("agent1 receives %d\n", inform_agent2);
}

agent agent2 {
  int inform_agent1;
  inform_agent1 = 0;
  selection_agent2 = 0;
  assert((believe agent2 (d == 3)));
  assert((believe agent2 (e == 1)));
  assert((believe agent2 (f == 7)));
  choose(selection_agent2, 1, 2, 3);
  if (selection_agent2 == 1) {
    print("agent2 -> d = 3\n");
    send(inform agent1 of (d == 3));
  }
  if (selection_agent2 == 2) {
    print("agent2 -> e = 1\n");
    send(inform agent1 of (e == 1));
  }
  if (selection_agent2 == 3) {
    print("agent2 -> f = 7\n");
    send(inform agent1 of (f == 7));
  }
  receive(inform agent1 of inform_agent1);
  print("agent2 receives %d\n", inform_agent1);
}

Fig. 4. The base example
they believe it). We can use MABLE’s ACL semantics to define these two types
of agents. Consider first the following mable.sem definition.

i: inform(j, phi)
(believe i phi)
(believe j (intend i (believe j phi)))

This says that the pre-condition for an inform performative is that the agent 
believes the content (phi) of the message. By defining the semantics in this way, 
an agent will only send the message if it believes it. (If the sender never believes 
the content, then its execution is indefinitely postponed.) 

By way of contrast, consider the following mable.sem definition of the inform
performative.

i: inform(j, phi)
1
(believe j (intend i (believe j phi)))

Here, the guard to the send statement is 1, which, as in languages such as C, is 
interpreted as a logical constant for truth. Hence the guard will always succeed, 
and the message send statement will always be enabled, irrespective of whether 
or not the agent actually believes the message content. Notice that this second 
case is actually the more general one, which we would expect to find in most 
applications. 

The next stage is to consider the process of actually checking whether or not 
agents respect the semantics of the language; of course, if we enforce compliance 
by way of the mable.sem file, then we would hope that our agents will always 
satisfy the semantics. But it is of course also possible that an agent will respect 
the semantics even though they are not enforced by the definition in mable.sem.
(Again, this is in fact the most general case.) 

For inform performatives, we can express the property to be checked in
LORA [18] as follows:

\[ \mathsf{A}\,\Box\,(\mathsf{Happens}\ i\ \mathit{inform}(j, \varphi)) \Rightarrow (\mathsf{Bel}\ i\ \varphi) \]

This formula simply says that, whenever agent $i$ sends an “inform” message
to agent $j$ with content $\varphi$, then $i$ believes $\varphi$. Now, given the enriched form of
MABLE claims that we described above, we can directly encode this formula in
MORA, as follows:

claim
  []
  (
    (happens agent1
      send(inform agent2 of (a == 10));)
    ->
    (believe agent1 (a == 10))
  );
This claim will hold of a system if, whenever the program statement

send(inform agent2 of (a == 10));

is executed by agent1, then in the system state from which the send statement
is executed, agent1 believes that a == 10.

We can insert this claim into the system in Figure 4, and use MABLE to 
check whether it is valid. If we do this, then we find that the claim is indeed 
valid; inspection of the code suggests that this is what we expect.

Verifying pre-conditions also requires checking that agents do not inform
other agents about facts that they do not believe. Given the MABLE code presented
in Figure 4, we need only remove the line

assert((believe agent1 (a == 10)));

and then set the pre-condition of the inform to 1 (i.e., true) in the mable.sem
file, and check the previous claim. As expected, the claim is not valid, since agent1
informs agent2 about something it does not believe.
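
The two experiments just described can be mimicked by a small Python sketch that enumerates the three runs of agent1 (one per value that choose may pick) and checks, for each run in which a given send happens, whether the content is believed at that point. This is a hand-written abstraction of the example, not output of MABLE or SPIN; the pre-condition is taken to be 1, and the function names runs and claim_holds are hypothetical.

```python
def runs(asserted):
    """Enumerate the three runs of agent1: one per value picked by choose."""
    messages = {1: "a == 10", 2: "b == 2", 3: "c == 5"}
    for selection, content in messages.items():
        # Each run is summarised by the state in which the send is enabled:
        # the beliefs asserted so far and the content about to be sent.
        yield {"selection": selection, "beliefs": set(asserted), "sends": content}


def claim_holds(asserted, content):
    """Whenever send(inform agent2 of content) happens, agent1 believes content."""
    return all(content in run["beliefs"]
               for run in runs(asserted) if run["sends"] == content)


full = ["a == 10", "b == 2", "c == 5"]   # all three assert statements present
without_a = ["b == 2", "c == 5"]         # the assert for a == 10 removed
```

With all beliefs asserted the claim for a == 10 holds on every run; once the assert is removed, the run that selects the first message is a counterexample.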

Verifying Rational Effects. We consider an agent to be respecting the se- 
mantics of an ACL if it satisfies the specification defined by the pre-condition 
part of a message whenever it sends the message [17]. The rational effect part of 
a performative’s semantics defines what the sender of the message wants to achieve
by sending it; but of course, this does not imply that sending the message is suf- 
ficient to ensure that the rational effect is achieved. This is because the agents 
that receive messages are assumed to be autonomous, exhibiting control over 
their own mental state. Nevertheless, it is useful to be able to determine in prin- 
ciple whether an agent respects the rational effect part of an ACL semantics or 
not, and this is the issue we discuss in this section. 

We will consider two cases in this section: credulous agents and sceptical 
agents. Credulous agents correspond to agents that always believe the informa- 
tion sent by other agents. We can directly define credulous agents in the following 
mable.sem file.

i: inform(j, phi)
(believe i phi)
(believe j phi)

This says that the recipient (j) of an inform message will always come to believe 
the contents of an inform message. 

Sceptical agents are those that believe that the sender intends that they 
believe the information; they do not necessarily come to directly believe the 
contents of the message. 

i: inform(j, phi)
(believe i phi)
(believe j (intend i (believe j phi)))

We can directly define a MORA claim to determine whether or not an agent
that is sent a message eventually comes to believe it. 
claim []
  (
    (happens agent1
      send(inform agent2 of (a == 10));)
    ->
    <> (believe agent2 (a == 10))
  );

This claim is clearly valid for credulous agents, as defined in the mable.sem file
given above; running MABLE with the example system immediately confirms 
this. 

Of course, the claim may also be true for sceptical agents, depending on 
how their program is defined. We can directly check whether or not a particular 
sceptical agent comes to believe the message it has been sent, with the following 
claim: 

claim
  []
  (
    (believe agent2
      (intend agent1
        (believe agent2 (a == 10))))
    ->
    <> (believe agent2 (a == 10))
  );
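
The difference between credulous and sceptical receivers can be sketched in Python by applying the two post-conditions to an initially empty belief set and asking whether the content is eventually believed. As before this is an illustrative abstraction rather than MABLE output: the adopt flag stands in for whatever the sceptical agent's own program might do after receiving the message, and all names are hypothetical.

```python
def credulous_post(i, j, phi):
    # (believe j phi): the receiver adopts the content directly.
    return phi


def sceptical_post(i, j, phi):
    # (believe j (intend i (believe j phi))): only the sender's intention is recorded.
    return ("intend", i, ("believe", j, phi))


def trace_after_inform(post, adopt=False):
    """Belief states of agent2 before and after receiving inform(a == 10)."""
    beliefs = set()
    trace = [set(beliefs)]
    fact = post("agent1", "agent2", "a == 10")
    beliefs.add(fact)
    trace.append(set(beliefs))
    if adopt and fact != "a == 10":
        # A sceptical agent whose own program later decides to adopt the content.
        beliefs.add("a == 10")
        trace.append(set(beliefs))
    return trace


def eventually_believes(trace, phi):
    """Finite-trace reading of <> (believe agent2 phi)."""
    return any(phi in state for state in trace)
```

Under these assumptions the credulous receiver always ends up believing the content, while the sceptical receiver does so only when its program takes the extra step.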



5 Conclusion 

We have described extensions to the MABLE multiagent programming language 
and its associated logical claim language that make it possible to verify whether 
MABLE agents satisfy the semantics of ACLs. We illustrated the approach with 
a number of case studies. A key issue for future work is that of moving from 
the design level (which is what MABLE represents) to the implementation level, 
in the form of, for example, JAVA code. One possibility we are currently inves- 
tigating is to enable MABLE to automatically generate JAVA code once a design 
has been satisfactorily debugged. Another interesting avenue for future work is 
investigating whether the MABLE framework might be used in the verification 
of other ACL semantics, such as Pitt’s protocol-based semantics [12], or Singh’s 
social semantics [16]. 
Abstract. An agent communication protocol specifies the rules of interaction
governing a dialogue between agents in a multiagent system. In
non-cooperative interactions (such as negotiation dialogues) occurring in 
open societies, the problem of checking an agent’s conformance to such 
a protocol is a central issue. We identify different levels of conformance 
(weak, exhaustive, and robust conformance) and explore, for a specific 
class of logic-based agents and an appropriate class of protocols, how to 
check an agent’s conformance to a protocol a priori, purely on the basis
of the agent’s specification. 



1 Introduction 

Protocols play a central role in agent communication. A protocol specifies the 
rules of interaction between two or more communicating agents by restricting 
the range of allowed follow-up utterances for each agent at any stage during a 
communicative interaction (dialogue). Such a protocol may be imposed by the 
designer of a particular system or it may have been agreed upon by the agents 
taking part in a particular communicative interaction before that interaction 
takes place. 

Protocols are public, i.e. they are (at least in principle) known to all participating
agents (and possibly also to any outside observers). As several authors
have pointed out, some form of public protocol is a necessary requirement for 
the definition of a suitable semantics of an agent communication language [13, 
16]. Without public conventions (as specified by a protocol), it would not be 
possible to assign meaning to an agent’s utterances. 

This “conventionalist” view stands in contrast to the “mentalistic” approach
taken, for instance, in the definition of FIPA-ACL [8], where the legality of ut- 
terances is specified in terms of the mental states of the agents participating in 
an interaction. This is not to say that an agent’s mental state is not relevant to 
communication. On the contrary, an agent’s goals and beliefs will strongly in- 
fluence that agent’s communicative behaviour. However, these mental attitudes 
should not have to be taken into account when we define what constitutes a legal
utterance at a given point in time. We therefore distinguish an agent’s communication
strategy, which may be private and will be determined by the agent’s
goals and beliefs, from the public protocol which lays down the conventions of 
communication in a given multiagent system in terms of publicly observable 
events (i.e., in particular, previous utterances) alone. 

By the very nature of protocols as public conventions, it is desirable to use a 
formal language to represent them. In particular, when used in connection with 
agents that are specified or even implemented using some form of executable 
logic, a logic-based representation language for communication protocols has 
many advantages. This paper summarises some of our recent work in the field of 
logic-based agent communication protocols, which has been initiated in [4] and 
further developed in [5]. It extends [4, 5] in that it discusses possible avenues for
future work and it provides a more detailed comparison between our approach 
and related work in the area. In particular, Sections 5 and 6 are new. 

Paper Overview. In Section 2, we discuss two options for representing interac- 
tion protocols: finite automata and our logic-based language. In Section 3 we 
then introduce three different levels of conformance to such a protocol: weak, 
exhaustive, and robust conformance. In Section 4 we show that our logic-based 
representation greatly facilitates checking whether a given agent is guaranteed
to always behave in conformance to a particular protocol. We briefly recall the 
definition of a class of logic-based agents introduced in [14] and present suffi- 
cient criteria for such agents to be either weakly or exhaustively conformant to 
a protocol. Section 5 discusses potential generalisations of our protocol language 
and Section 6 discusses related work. Section 7 concludes. 

2 Representing Agent Communication Protocols 

In this section we present a logic-based representation formalism for a simple yet 
expressive class of interaction protocols. We assume some restrictions on the kind 
of interactions that we want to model. The dialogues we consider only involve 
two agents which sequentially alternate dialogue moves (utterances). These re- 
strictions (notably avoiding concurrency) allow us to concentrate on a particular 
class of protocols, namely those representable by means of deterministic finite 
automata (DFAs), of which there are numerous examples in the literature. 

Deterministic Finite Automata-Based Protocols. A DFA consists of (i) a set of
states (including an initial state and a set of final states), (ii) an input alphabet,
and (iii) a transition function δ which maps pairs of states and elements of
the input alphabet to states [12]. In the context of communication protocols, 
elements of the input alphabet are dialogue moves and states are the possible 
stages of the interaction. 

A protocol based on such a DFA representation determines a class of well- 
formed dialogues where each and every dialogue move is a legal continuation of 
the interaction that has taken place so far: 
Given a protocol based on a DFA with transition function δ, a dialogue
move P is a legal continuation (or simply a legal move) with respect to
a state S iff there exists a state S' such that S' = δ(S, P).

Example. Fig. 1 shows a simple example of a DFA-based protocol which regulates 
a class of negotiation dialogues between two agents A and B. It specifies that 
after a request made by A (in an initial state 0), B may either accept that request, 
refuse it, or choose to challenge it. Either one of the first two options would end 
the negotiation dialogue (bringing the dialogue into a final state). In case agent 
B decides to challenge, agent A has to reply with a justify move, which takes
the dialogue back to the state where B can either accept, refuse, or challenge. 
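To make the definition of a legal continuation concrete, the following Python sketch encodes the protocol described for Fig. 1 as a transition table. The state numbers other than the initial state 0, and the names DELTA, FINAL_STATES, is_legal_move and run_dialogue, are assumptions made for illustration; the figure itself may number its states differently.

```python
# Transition function of the Fig. 1 protocol as a dictionary.
# Assumption: states 1-4 are hypothetical labels; only state 0 is named in the text.
DELTA = {
    (0, "request"): 1,    # A opens with a request
    (1, "accept"): 2,     # B accepts (final)
    (1, "refuse"): 3,     # B refuses (final)
    (1, "challenge"): 4,  # B challenges
    (4, "justify"): 1,    # A justifies, B chooses again
}
FINAL_STATES = {2, 3}


def is_legal_move(state, move):
    """A move is legal in `state` iff delta(state, move) is defined."""
    return (state, move) in DELTA


def run_dialogue(moves, start=0):
    """Return the state reached after `moves`, or None if some move is illegal."""
    state = start
    for move in moves:
        if not is_legal_move(state, move):
            return None
        state = DELTA[(state, move)]
    return state
```

A dialogue is well-formed in this sketch when run_dialogue returns a state rather than None, and complete when that state is in FINAL_STATES.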

Logic-Based Protocols. Protocols such as that of Fig. 1 can alternatively be 
represented as sets of if-then-rules which specify the set of correct responses for 
a particular incoming dialogue move. For example, to express that agent B could 
react to a request move sent by A either by accepting, refusing, or challenging 
the request, we may use the following if-then rule: 

    tell(A, B, request, D, T) ⇒ tell(B, A, accept, D, T+1) ∨
                                tell(B, A, refuse, D, T+1) ∨
                                tell(B, A, challenge, D, T+1)

Here, the variable B refers to the name of the agent whose communicative be- 
haviour is restricted by the rule and A represents the agent sending a message to 
B. Note that variables in the above formula are implicitly universally quantified 
over the whole formula.¹ In general, in this logic-based representation, dialogue
moves are instances of the following schema: 

tell(X, Y, Subject, D, T) 

¹ In our chosen representation, in cases where there are variables that only appear on
the righthand side of an implication, these variables are understood to be
existentially quantified.

Here, X is the utterer, Y is the receiver (X ≠ Y), D is the identifier of the
dialogue², and T the time when the move is uttered. Subject is the type of
the dialogue move, i.e. a performative (such as request) of the communication
language, possibly together with a content (as in request(item_n)). For most of
this paper, we are going to omit the content of dialogue moves, as it is usually 
not relevant to the definition of legality conditions for automata-based protocols 
and similar formalisms. Also, we shall mostly use the abbreviated form P(T) for 
dialogue moves (where P stands for the performative of the respective move and 
T stands for the time of the move), thereby omitting the parameters not relevant 
to our discussion. For example, our earlier concrete rule could be represented in 
short as 

    request(T) ⇒ accept(T+1) ∨
                 refuse(T+1) ∨
                 challenge(T+1)

For the sake of simplicity, we will assume that the start of the protocol is triggered 
by some external event START - it is possible to conceive this as the result of 
some meta-level negotiation process to agree on a particular protocol. The start 
signal START(X,Y, D,T) is sent by the system to agent Y to sanction at time 
T the beginning of a dialogue with identifier D between agent Y and agent X.
We will assume that the system sends such a signal exactly once and exactly to 
one agent during a dialogue. We will also assume that each time this signal is 
sent to an agent, it has a new dialogue identifier. Similarly, a dialogue ends once 
one of the agents sends the signal STOP to the system. STOP(X,Y, D,T) is 
sent by agent X to the system at time T to sanction the end of a dialogue with 
identifier D between X and Y. Dialogue inputs for an agent are either dialogue 
moves sent by other agents or a START signal sent by the system. 

Going back to the example of Fig. 1, we observe that this automaton in fact 
represents two subprotocols, one for the initiator of a dialogue, and one for its 
partner (naturally, each agent can serve as initiator and as partner of a dialogue, 
at different times). We will refer to these two subprotocols as 𝒫_i and 𝒫_p. They
can be translated into a set (composed of two subsets) of if-then-rules. Here is 
the subprotocol for the initiator: 

𝒫_i:  START(X, Y, D, T) ⇒ tell(Y, X, request, D, T+1)
      tell(X, Y, accept, D, T) ⇒ STOP(Y, X, D, T+1)
      tell(X, Y, refuse, D, T) ⇒ STOP(Y, X, D, T+1)
      tell(X, Y, challenge, D, T) ⇒ tell(Y, X, justify, D, T+1)

Note that agent Y does not have any choices when replying to messages received 
from agent X (at least not as far as the performatives of the dialogue moves are 
concerned). Also note that agent Y is responsible for transmitting the STOP 
signal to the system after having received either one of the terminating moves 
from X. The subprotocol for the partner in a dialogue consists of two rules: 



² In general, the identifier might be a function of a protocol name.

𝒫_p:  tell(X, Y, request, D, T) ⇒ tell(Y, X, accept, D, T+1) ∨
                                  tell(Y, X, refuse, D, T+1) ∨
                                  tell(Y, X, challenge, D, T+1)
      tell(X, Y, justify, D, T) ⇒ tell(Y, X, accept, D, T+1) ∨
                                  tell(Y, X, refuse, D, T+1) ∨
                                  tell(Y, X, challenge, D, T+1)

Note that agent Y has multiple choices when replying to messages received
from agent X.
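As an illustration, the two subprotocols can be held as plain lookup tables from a trigger to the set of permitted replies. This is only a sketch: it keeps the performatives and drops the sender, receiver, dialogue identifier and time arguments, and the names P_INITIATOR, P_PARTNER and legal_replies are hypothetical.

```python
# Each subprotocol maps a trigger performative to the set of legal replies.
# Assumption: sender, receiver, dialogue id and time are omitted for brevity.
P_INITIATOR = {
    "START": {"request"},
    "accept": {"STOP"},
    "refuse": {"STOP"},
    "challenge": {"justify"},
}
P_PARTNER = {
    "request": {"accept", "refuse", "challenge"},
    "justify": {"accept", "refuse", "challenge"},
}


def legal_replies(subprotocol, trigger):
    """Return the replies the subprotocol allows after `trigger` (empty if none)."""
    return subprotocol.get(trigger, set())
```

The initiator table has singleton reply sets throughout, which mirrors the observation that the initiator has no choice of performative, whereas the partner table offers three alternatives per trigger.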

Shallowness. In our example we have simply translated an automata-based pro- 
tocol into if-then-rules where we have a single performative on the lefthand side. 
We call protocols that permit such a straightforward translation shallow. Shal- 
low protocols correspond to DFAs where it is possible to determine the next 
state of the dialogue on the sole basis of the previous move. Of course, this is 
not always the case for all protocols, since in some protocols it may be necessary 
to refer to the current state of the dialogue to determine the new state (think of 
two transitions with the same label leaving two different states and leading to 
two different states).

In principle, any automata-based protocol can be transformed into a protocol 
that is shallow in this sense (by simply renaming any duplicate transitions). In 
fact, many of the automata-based protocols proposed in the multiagent systems 
literature happen to be shallow already or could at least be made shallow by 
renaming only a small number of transitions. 
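The shallowness condition can be tested mechanically on a transition table. The sketch below assumes the DFA is given as a dictionary from (state, move) pairs to next states; the function name is_shallow and the example tables are hypothetical.

```python
from collections import defaultdict


def is_shallow(delta):
    """Check that every move label always leads to the same next state.

    `delta` maps (state, move) pairs to next states. If some label has two
    different target states, the previous move alone does not determine
    the new state, so the protocol is not shallow.
    """
    targets = defaultdict(set)
    for (_state, move), next_state in delta.items():
        targets[move].add(next_state)
    return all(len(states) == 1 for states in targets.values())


# Hypothetical encoding of the protocol of Fig. 1 (state numbers assumed).
FIG1 = {
    (0, "request"): 1,
    (1, "accept"): 2,
    (1, "refuse"): 3,
    (1, "challenge"): 4,
    (4, "justify"): 1,
}
# Two transitions labelled "request" that lead to different states.
NOT_SHALLOW = {(0, "request"): 1, (5, "request"): 6}
```

Renaming one of the two "request" transitions in NOT_SHALLOW (for instance to a hypothetical "request2") would make the table pass the check, which is the transformation the text refers to.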

Well-Formedness Requirements. In the light of the above remarks, we will gen- 
erally represent shallow protocols as two sets of rules of the following form: 

    P(T) ⇒ P'_1(T+1) ∨ P'_2(T+1) ∨ ··· ∨ P'_k(T+1)

We will call these rules protocol rules. The righthand side of a protocol rule
defines the possible continuations with respect to the protocol after the input 
P (which we will sometimes refer to as the trigger of the rule). The set of all 
triggers appearing in a subprotocol for a given agent is called the set of expected 
inputs for that agent. To ensure that this protocol is well-formed we will require 
that the two sets of rules meet a number of requirements (R1-R5): 

— (R1, initiation): there is at least one rule with START on the lefthand side
in the protocol, and START may never occur on the righthand side of a rule;

— (R2, matching): any dialogue move except STOP occurring on the righthand
side of one subprotocol also occurs on the lefthand side of the other, and vice
versa;

— (R3, non-concurrency): every subprotocol includes the following additional
rule to avoid concurrent moves (⊥ stands for false):



    tell(X, Y, S_1, D, T) ∧ tell(X, Y, S_2, D, T) ∧ S_1 ≠ S_2 ⇒ ⊥;

— (R4, alternating): for each rule occurring in a subprotocol, if X is the receiver
and Y the utterer of the dialogue move occurring on the lefthand side, it
must be the case that X is the utterer and Y the receiver of every dialogue
move occurring on the righthand side (except for START and STOP);

— (R5, distinct triggers): in each subprotocol, all dialogue moves occurring on 
the lefthand side of the rules are distinct from each other. 

We note here that these are very simple requirements. Any protocol that is well- 
formed in this sense will provide a complete description of what constitutes a 
sequence of legal dialogue moves. R2 (matching) is the central requirement here; 
it ensures that for any move that is itself a legal continuation of the dialogue 
that has taken place so far, there will be a protocol rule that determines the 
range of legal follow-ups for that move. Ensuring this property in non-shallow 
protocols is more complicated as we shall see in Section 5. 
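Some of these requirements lend themselves to a simple syntactic check. The following sketch covers R1, R2 and R5 only, for subprotocols given as lists of (trigger, replies) pairs over bare performatives; R3 and R4 concern the full tell(...) terms and are not modelled. Reading the "vice versa" of R2 as exempting START is an assumption, and all names are hypothetical.

```python
def check_well_formed(sub_a, sub_b):
    """Return the subset of {"R1", "R2", "R5"} violated by two subprotocols.

    Each subprotocol is a list of (trigger, set_of_replies) pairs.
    """
    violations = set()
    rules = sub_a + sub_b
    # R1: START triggers at least one rule and never occurs as a reply.
    has_start = any(trigger == "START" for trigger, _ in rules)
    start_as_reply = any("START" in replies for _, replies in rules)
    if not has_start or start_as_reply:
        violations.add("R1")
    # R2: replies of one subprotocol (minus STOP) coincide with the triggers
    # of the other (minus START; this exemption is an assumption).
    for own, other in ((sub_a, sub_b), (sub_b, sub_a)):
        replies = set().union(*(r for _, r in own)) - {"STOP"}
        triggers = {t for t, _ in other} - {"START"}
        if replies != triggers:
            violations.add("R2")
    # R5: triggers within one subprotocol are pairwise distinct.
    for sub in (sub_a, sub_b):
        triggers = [t for t, _ in sub]
        if len(triggers) != len(set(triggers)):
            violations.add("R5")
    return violations


INITIATOR = [
    ("START", {"request"}),
    ("accept", {"STOP"}),
    ("refuse", {"STOP"}),
    ("challenge", {"justify"}),
]
PARTNER = [
    ("request", {"accept", "refuse", "challenge"}),
    ("justify", {"accept", "refuse", "challenge"}),
]
```

Dropping the justify rule from PARTNER leaves the initiator's justify reply without a matching trigger, so the check reports R2.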

3 Levels of Conformance 

Broadly speaking, an agent is conformant to a given protocol if its behaviour is 
legal with respect to that protocol. We have found it useful to distinguish three 
levels of conformance to a protocol, which we are going to discuss next. Note that 
we are going to define these notions on the basis of the observable conversational 
behaviour of the agents (i.e. what they utter or not) alone, without making 
further assumptions on how they actually come to generate these utterances. 

Weak Conformance. We start with the notion of weak conformance: 

An agent is weakly conformant to a protocol 𝒫 iff it never utters any
dialogue move which is not a legal continuation (with respect to 𝒫) of
any state of the dialogue the agent might be in.

It is clear that any application governed by a protocol at least requires the level 
of weak conformance - otherwise it would not make sense to define a protocol 
in the first place. This is true at least if, as in this paper, we perceive protocols 
as (syntactic) rules that define the legality of an utterance as a follow-up in 
a given dialogue. If we adopt a broader notion of protocols, however, levels of 
conformance that are less restrictive than our weak conformance may also be 
considered. Yolum and Singh [18], for instance, advocate a flexible approach to 
interaction protocols in which agents may skip steps in a protocol as long as this 
does not render the interaction as a whole meaningless. Such protocols cannot 
be specified purely syntactically any more, but have to capture the “intrinsic 
meanings of actions” [18] for us to be able to decide which “shortcuts” are 
admissible and which are not. 

Exhaustive Conformance. The notion of weak conformance captures that the 
agent does not utter any illegal moves, but does not actually require that the 
agent utters any dialogue move at all. For interactions where “silent moves” are 
undesirable, a stronger version of conformance is usually required. We make this 
idea precise with the notion of exhaustive conformance: 
An agent is exhaustively conformant to a protocol 𝒫 iff it is weakly
conformant to 𝒫 and it will utter at least one dialogue move which is a legal
continuation of any legal input of 𝒫 it receives.

Exhaustive conformance is certainly what is intuitively expected in most inter- 
actions - it is indeed often preferred to avoid considering silent moves as part 
of a protocol, at least to avoid confusion with lost messages. One may then ar- 
gue that exhaustive conformance should be the minimum requirement for any 
interaction. 

We believe, however, it is worth making the distinction between weak and ex- 
haustive conformance. The first reason is that there are examples where the lack 
of response can be considered to be part of the protocol. In such circumstances, 
it can be sufficient to design a weakly conformant agent, provided that silent 
moves will not have undesirable consequences. For instance, in a Dutch auction 
process “when there is no signal of acceptance from the other parties in the auc- 
tion (other agents in the negotiation) the auctioneer makes a new offer which he 
believes more acceptable (by reducing the price). Here, because of the convention 
(protocol) under which the auction operates, a lack of response is sufficient feed- 
back for the auctioneer to infer a lack of acceptance.” [10]. In this case, the agent 
can safely be designed to react appropriately only to the proposals it is ready 
to accept. But if we consider recent argumentation-based protocols inspired by 
dialectical models it is sometimes assumed that “silence means consent” [2]. In 
this case, a lack of response can commit the receiver to some propositions - this 
is a typical case where it is crucial that agents are exhaustively conformant. The 
second reason for our distinction of weak and exhaustive conformance is that 
they are conceptually different since weak conformance only involves not utter- 
ing (any illegal moves), while exhaustive conformance involves uttering (some 
legal move). This implies substantially different approaches when the issues of 
checking and enforcing conformance are raised, as we shall see later. 

Robust Conformance. Another important problem of agent communication is 
the need to deal with illegal incoming messages, and to react appropriately to 
recover from such violations. For instance, any FIPA-compliant communicative
agent has to integrate a performative not-understood as part of its language [8]. 
This motivates us to introduce the following notion of robust conformance: 

An agent is robustly conformant to a protocol 𝒫 iff it is exhaustively
conformant to 𝒫 and for any illegal input move received from the other
agent it will utter a special dialogue move (such as not-understood)
indicating this violation.

Robust conformance goes a step further than exhaustive conformance since it 
requires that an appropriate response is uttered also in reply to illegal input 
moves. Technically, this necessitates that the agent is able to identify the legality 
of an incoming dialogue move, i.e. it needs to be able to check conformance with 
respect to the other agent’s subprotocol. 
Note also that in the case where all agents in the society are known to
be weakly conformant, it is theoretically unnecessary to deal with robust con- 
formance (since no agent will ever utter an illegal move). The same applies to 
systems that are “policed” in the sense that messages not conforming to the pro- 
tocol will simply not be delivered to the intended recipient. Such an assumption 
would, however, somewhat contradict the “spirit” of an open society. We should 
also point out that in dialogues with a very high proportion of illegal utterances
the additional not-understood moves may in fact burden communication chan- 
nels unnecessarily, and, therefore, simply ignoring illegal moves would in fact be 
a better strategy. 

4 Checking Conformance 

When checking an agent’s conformance to a publicly agreed interaction protocol 
we can distinguish two cases: checking conformance at runtime and checking
conformance a priori³. The former means checking the legality of the moves as they
occur in a dialogue. This would enable a society of agents or a particular agent 
to determine the legality of an observed dialogue move. Checking conformance 
a priori means checking the legality of an agent’s communicative behaviour on 
the basis of its specification. In other words, a priori conformance allows us to 
guarantee in advance that an agent will be conformant to a given protocol.
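Runtime checking is the simpler of the two cases for a shallow protocol, since each move only has to be compared with its predecessor. The sketch below assumes the two subprotocols of the running example have been merged into one table over bare performatives; the names PROTOCOL and first_violation are hypothetical.

```python
# Merged rules of the running example: trigger -> legal follow-ups.
# Assumption: only performatives are tracked, not senders or times.
PROTOCOL = {
    "START": {"request"},
    "request": {"accept", "refuse", "challenge"},
    "justify": {"accept", "refuse", "challenge"},
    "challenge": {"justify"},
    "accept": {"STOP"},
    "refuse": {"STOP"},
}


def first_violation(trace):
    """Return the index of the first illegal move in `trace`, or None.

    A move is illegal if it is not among the follow-ups that PROTOCOL
    allows after the immediately preceding move.
    """
    for i in range(1, len(trace)):
        if trace[i] not in PROTOCOL.get(trace[i - 1], set()):
            return i
    return None
```

An observer only needs the publicly visible trace for this check, which is why runtime checking raises no privacy concerns.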

In general, checking a priori whether an agent will always behave in con- 
formance to a given set of protocols is difficult, if not impossible. Firstly, the 
privacy requirement of a society of agents makes it problematic for the society 
to access the agent’s private specification, and secondly the complexity of the 
specifications makes it hard - even when access to that specification is granted 
(for the agent itself, for instance) - to actually decide whether the agent will 
be conformant or not. In particular, the behaviour of the agent will typically 
depend on some hardly tractable notions, such as beliefs and intentions. As we 
shall see in this section, for a particular class of logic-based agents and for our 
shallow protocols we can overcome these difficulties, at least in the case of weak 
conformance. We are also going to discuss how to extend these results to check- 
ing exhaustive conformance, although - as far as our privacy requirements are 
concerned - our results will necessarily be less satisfying in this case. 

Logic-Based Agents. We are now going to consider the case of a specific class of
agents based on abductive logic programming that have recently been used in
the context of negotiation scenarios [14]. The communication strategy S of such
an agent (which forms part of its knowledge base 𝒦) is represented as a set of
if-then rules of the following form: 

    P(T) ∧ C ⇒ P'(T+1)

On receiving dialogue move P at time T,⁴ an agent implementing this rule
would utter P' at time T+1, provided condition C is entailed by its (private)
knowledge base. Again, variables are understood to be implicitly quantified in
the same way as in our protocol rules. The dialogue moves P and P' will be based
on the agent’s communication language. Below, we refer to if-then rules such as
the above as strategy rules.

³ Guerin and Pitt [9] refer to the latter as compliance at design time.

⁴ As earlier in the paper, we represent dialogue moves in short simply by referring to
the performative in the move and to the time of the move.

Response Spaces. In preparation for defining a suitable criterion for guaranteed 
weak conformance to a given protocol, we introduce the notion of response space 
for a logic-based agent. Intuitively, the response space of an agent specifies the 
possible moves that the agent can make when using a given strategy S, without
considering the specific conditions relating to its private knowledge base. This 
abstraction from an agent’s communicative behaviour is related to the idea of 
an agent automaton proposed by Singh [15]. 

The response space S* of an agent with strategy S based on the
communication language ℒ is defined as follows:

    { P(T) ⇒ ⋁{ P'(T+1) | [P(T) ∧ C ⇒ P'(T+1)] ∈ S } | P ∈ ℒ }   with ⋁{} = ⊥

That is, the response space is, essentially, the set of protocol rules we get by 
first dropping all private conditions C and then conjoining implications with 
identical antecedents by collecting the corresponding consequents into a single 
disjunction. For example, the strategy 

    S = { request(T) ∧ happy ⇒ accept(T+1),
          request(T) ∧ unhappy ⇒ refuse(T+1) }

determines the following response space:

    S* = { request(T) ⇒ accept(T+1) ∨ refuse(T+1) }
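The construction of the response space can be sketched in a few lines of Python. Strategy rules are represented here as (trigger, condition, response) triples and the private condition is simply ignored; this representation and the name response_space are assumptions for illustration.

```python
def response_space(strategy, language):
    """Drop private conditions and group responses by trigger.

    `strategy` is an iterable of (trigger, condition, response) triples and
    `language` the set of performatives. An empty response set stands for
    the empty disjunction, i.e. falsum.
    """
    space = {performative: set() for performative in language}
    for trigger, _condition, response in strategy:
        space.setdefault(trigger, set()).add(response)
    return space


STRATEGY = [
    ("request", "happy", "accept"),
    ("request", "unhappy", "refuse"),
]
```

For the example strategy and a language containing only request, the result is a single entry mapping request to accept and refuse, in agreement with S* above.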

Checking Weak Conformance. We are now going to state a simple criterion that 
offers an elegant way of checking weak conformance a priori for a logic-based 
agent. In particular, it avoids dealing with the dialogue history, and it does not 
make any assumptions on the content of the agent’s knowledge base (except to 
require that it is possible to extract the response space, as previously described). 
The following is a sufficient criterion for weak conformance: 

An agent is weakly conformant to a protocol 𝒫 whenever that protocol is
a logical consequence of the agent’s response space.

This result is proved in [5]. Observe, however, that this is not a necessary criterion
for weak conformance, because, looking at the form of strategies, it is clear that
private conditions may prevent the agent from uttering a particular dialogue
move. In other words, it could be the case that S* does not entail 𝒫 but that
the agent is still weakly conformant to 𝒫 because of its specific knowledge base.

The above result shows that, in the case of weak conformance, it is possible 
to check conformance a priori by inspecting only a relatively small part of an 
agent’s specification (namely, what we could call its “communication module” or 
communication strategy). In particular, we are not required to make any judge- 
ments based on the content of its (probably dynamically changing) knowledge 
base in general. 
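For shallow protocols the criterion can be approximated by a set-inclusion test. The sketch below rests on a simplifying assumption that is not taken from the proof in [5]: performatives are treated as independent atoms, so that a response-space rule entails the protocol rule with the same trigger exactly when its responses are among the legal replies. The name entails_protocol is hypothetical.

```python
def entails_protocol(space, protocol):
    """Sufficient check for weak conformance under a propositional reading.

    `space` and `protocol` both map a trigger to a set of performatives.
    Protocol triggers missing from `space` are treated as having no
    strategy rule (empty disjunction), which assumes that every protocol
    trigger belongs to the agent's communication language.
    """
    return all(
        space.get(trigger, set()) <= legal
        for trigger, legal in protocol.items()
    )


PROTOCOL = {"request": {"accept", "refuse", "challenge"}}
GOOD_SPACE = {"request": {"accept", "refuse"}}
BAD_SPACE = {"request": {"accept", "ignore"}}
```

A negative answer does not show that the agent violates the protocol, since the criterion is sufficient but not necessary: a private condition may still block the offending response.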
Checking Exhaustive Conformance. In the case of exhaustive conformance, the
situation is rather different. To understand why, recall that as well as requiring 
weak conformance, exhaustive conformance requires the property of uttering at 
least one legal move for any legal input. The latter property, which we shall 
simply refer to as exhaustiveness (of an agent) may be considered independently 
from a particular protocol. In [14], for instance, the authors define a notion of 
exhaustiveness with respect to a given communication language (as being able to 
utter a response for any incoming move belonging to that language). Even for our
agents, whose communicative behaviour is determined by if-then rules of the form
P(T) ∧ C ⇒ P'(T+1), it is not generally possible to guarantee exhaustiveness (be
it with respect to a given protocol, language, or in general). We cannot generally 
ensure that one of these rules will indeed “fire” for an incoming move P(T), 
because none of the additional conditions C may be entailed by the current 
state of the agent’s knowledge base. 

As shown in [3], one way of ensuring exhaustive conformance would be to rely 
on logical truths that are independent from the (possibly dynamic) knowledge 
base of the agent. For a strategy S and any performative P in a given
communication language, let COND_S(P) denote the disjunction of all the private
conditions that appear in a strategy rule in S together with the trigger P(T),
i.e.:

    COND_S(P) = ⋁{ C | [P(T) ∧ C ⇒ P'(T+1)] ∈ S }   with ⋁{} = ⊥

Now, if COND_S(P) is a tautology for every performative P appearing on the
lefthand side of the relevant subprotocol of a protocol 𝒫, then any agent
implementing the strategy S is guaranteed to utter some move for any input expected
in 𝒫. Hence, we obtain a useful sufficient criterion for exhaustive conformance
(again, with respect to our shallow protocols): 

An agent with strategy S is exhaustively conformant to a protocol 𝒫
whenever it is weakly conformant to 𝒫 and COND_S(P) is a tautology
for every expected input P (for that agent, with respect to 𝒫).

Of course, generally speaking, checking this condition is an undecidable problem 
because verifying theoremhood in first-order logic is. In practice, however, we
would not expect this to be an issue given the simplicity of typical cases. As 
an example, consider a protocol consisting of only the following rule stipulating 
that any request by another agent X should be either accepted or refused: 

    request(X, T) ⇒ accept(T+1) ∨ refuse(T+1)

An agent may implement the following simple strategy:

    request(X, T) ∧ friend(X) ⇒ accept(T+1)
    request(X, T) ∧ ¬friend(X) ⇒ refuse(T+1)

The disjunction ¬friend(X) ∨ friend(X), with X being implicitly universally
quantified, is a theorem. Hence, our agent would be exhaustively conformant 
(note that the agent is certainly going to be weakly conformant, because the
protocol is a consequence of its response space - in fact, the two are actually
identical here). A similar idea is also present in [14], although not in the context
of issues pertaining to protocol conformance. Fulfilling the above criterion is not
an unreasonable requirement for a well-designed communication strategy S that
is intended to be used for interactions governed by a given protocol 𝒫.
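In simple cases such as this one, the tautology test can be carried out by brute force. The sketch below is propositional only: it fixes the agent X and treats friend as a single atom, which is an assumption rather than the first-order setting of the text. Conditions are modelled as Python functions over a truth assignment, and the name cond_is_tautology is hypothetical.

```python
from itertools import product


def cond_is_tautology(conditions, atoms):
    """Check that the disjunction of `conditions` holds in every assignment.

    `conditions` is a list of functions from an assignment (dict mapping
    atom names to booleans) to a boolean; `atoms` lists the atom names.
    An empty list of conditions is the empty disjunction and never holds.
    """
    for values in product([False, True], repeat=len(atoms)):
        assignment = dict(zip(atoms, values))
        if not any(condition(assignment) for condition in conditions):
            return False
    return True


# Private conditions attached to the trigger request in the example strategy.
REQUEST_CONDITIONS = [
    lambda v: v["friend"],      # friend(X)
    lambda v: not v["friend"],  # not friend(X)
]
```

The enumeration is exponential in the number of atoms, which is acceptable for the handful of conditions typically attached to one trigger.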

We continue our discussion of exhaustiveness by observing that, in cases 
where we can identify a static part of an agent’s knowledge base (beyond the 
set of rules making up its communication strategy), we can give an even more 
general sufficiency criterion that guarantees exhaustive conformance: 

An agent with strategy S is exhaustively conformant to a protocol 𝒫
whenever it is weakly conformant to 𝒫 and COND_S(P) is a logical
consequence of the static part of the agent’s knowledge base for every expected
input P.

To illustrate the idea, we slightly change our earlier example and replace the 
agent’s second strategy rule with the following strategy rule: 

    request(X, T) ∧ enemy(X) ⇒ refuse(T+1)

That is, our agent will refuse any request by X if it considers X to be an enemy. 
Now our first criterion does not apply anymore; we cannot ensure exhaustive 
conformance. However, if the agent’s knowledge base includes a formula such as 
¬enemy(X) ⇒ friend(X), expressing that anyone who is not an enemy should be
considered a friend, then we can show that friend(X) ∨ enemy(X) is a logical
consequence of that knowledge base and, thereby, that our agent will be exhaustively
conformant to the protocol. Note that this agent may generate two responses for 
a single input, namely in cases where both friend(X) and enemy(X) are true, 
which would conflict with the non-concurrency requirement of our protocols (see 
well-formedness requirement R3). 
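Both observations can be reproduced with a small propositional sketch, again fixing X and treating friend and enemy as atoms (an assumption of the sketch). The helper names assignments, entailed_by and double_responses are hypothetical.

```python
from itertools import product


def assignments(atoms):
    """Yield every truth assignment over `atoms` as a dict."""
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))


def entailed_by(static_kb, conditions, atoms):
    """True iff the disjunction of `conditions` holds in every model of `static_kb`."""
    return all(
        any(c(v) for c in conditions)
        for v in assignments(atoms)
        if all(f(v) for f in static_kb)
    )


def double_responses(static_kb, conditions, atoms):
    """Models of the static part in which more than one condition fires."""
    return [
        v
        for v in assignments(atoms)
        if all(f(v) for f in static_kb)
        and sum(bool(c(v)) for c in conditions) > 1
    ]


ATOMS = ["friend", "enemy"]
CONDITIONS = [lambda v: v["friend"], lambda v: v["enemy"]]
# not enemy(X) => friend(X), written as the equivalent disjunction.
STATIC_KB = [lambda v: v["enemy"] or v["friend"]]
```

With the static formula the disjunction is entailed, without it the check fails, and double_responses returns the single assignment in which X is both friend and enemy, the case that clashes with R3.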

Enforcing Conformance. Finally, even when an agent cannot be shown to be 
conformant a priori , it may still be possible to constrain its behaviour at runtime 
by simply forcing it to comply with the rules of the protocol. The problem of
enforcing conformance (referred to as regimentation by Jones and Sergot [11]) is 
then to try to find easy (and hopefully automatic) ways to ensure that an agent 
will always be conformant to a given protocol. As shown in [5], for any shallow 
protocol 𝒫, a logic-based agent generating its dialogue moves from a knowledge
base of the form 𝒦 ∪ 𝒫 will be weakly conformant to 𝒫.

That is, the agent could simply “download” the appropriate protocol when 
entering a society and thereby guarantee conformance (and avoid possible penal- 
ties) without requiring any additional reasoning machinery. The intuition behind 
the proof of this result is that the additional protocol rules given by 𝒫 (together
with the non-concurrency rule of well-formedness requirement R3) would render 
any branches in the agent’s internal derivation tree corresponding to illegal di- 
alogue moves inconsistent and thereby actively prevent the agent from uttering 
such moves. 
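The effect of this mechanism can be imitated outside a logic-programming setting by filtering an agent's candidate replies against the protocol. The sketch below is not the abductive proof procedure referred to above; it is a plain filter with the same observable outcome, and the name enforce is hypothetical.

```python
def enforce(candidates, protocol, trigger):
    """Keep only the candidate replies that `protocol` allows after `trigger`.

    `candidates` is the list of moves the agent's own strategy proposes,
    and `protocol` maps a trigger to its set of legal replies.
    """
    legal = protocol.get(trigger, set())
    return [move for move in candidates if move in legal]


PROTOCOL = {"request": {"accept", "refuse", "challenge"}}
```

Like the result it illustrates, the filter guarantees weak conformance only: if every candidate is removed the agent stays silent, so exhaustive conformance is not ensured.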

Fig. 2. A protocol that is problematic for n-triggers rules



5 Beyond Shallow Protocols 

In this section, we are going to highlight a number of ways in which our logic- 
based representation language for protocols may be extended to describe a wider 
class of communication protocols. 

Protocols with Several Triggers. Shallow protocols can be seen as a special case 
of what could be called n-triggers protocols, which can be represented by if-then 
rules whose lefthand side may refer to any of the n previous utterances (both the 
agent’s and its partner’s) rather than just the very last utterance. Such if-then 
rules (referred to below as n-trigger protocol rules) have the form:

    P_1(T−1) ∧ ··· ∧ P_n(T−n) ⇒ P'_1(T) ∨ ··· ∨ P'_k(T)

For this class of protocols, we may or may not require a trigger to be present 
for every time point from T−n to T−1. The latter seems to be more convenient
for most examples. However, if the range of time points referred to in the list 
of triggers on the lefthand side is not always the same, then it becomes more 
difficult to formulate appropriate well-formedness conditions for this class of 
protocols. It is more complicated to determine if a given set of protocol rules is 
contradictory in the sense of forcing different responses in the same situation, 
and whether it is complete, in the sense of providing a rule for every possible 
state a dialogue following these rules may end up in. 

Using n-trigger protocol rules, we can describe protocols where the last n 
moves matter. However, a DFA can express more complicated features. To
illustrate this point, consider the automaton of Fig. 2. This is an example of a
two-phase negotiation protocol. Starting in state 0, agent A sends a first request
which takes us into state 1. In this state, agent B may challenge the other agent, 
accept or refuse the request, or use its right to veto. In the case of a challenge, 
we enter a justification loop until either A retracts the request or B stops chal- 
lenging. In state 6 the situation is very similar to state 1, but in state 6 agent 
B is not allowed to veto, nor is it allowed to refuse the request (but it can
still challenge or accept the request). So it is a bit of a gamble for B to veto
A’s request and ask for a new request by A, as B can never refuse or veto the
new request. Now, after receiving either a request or a justify move, it is crucial
to be able to determine whether one is in state 1 or in state 6, because these
states are different, as we have just seen. Intuitively, it is easy to see that in
order to be in state 6, B must have sent a veto move at some point during
the dialogue. But because of the loops, it is not possible to specify when this 
dialogue move has occurred. So, this protocol cannot be represented as a set of 
n-trigger protocol rules. 

If we allow for arbitrary time references (rather than merely specific time 
points in the past) together with temporal constraints on the lefthand side of a 
rule then we can express that veto must have occurred at some point. In such 
an enriched language, we would be able to write protocol rules for state 6: 

request(T−1) ∧ veto(T') ∧ (T' < T) ⇒ accept(T) ∨ challenge(T)

justify(T−1) ∧ veto(T') ∧ (T' < T) ⇒ accept(T) ∨ challenge(T)

However, writing similar rules for state 1 is still not possible, as we cannot express 
that veto has not occurred in the past. 
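
To make the two enriched rules for state 6 concrete, the following is a minimal sketch of how they could be evaluated against a recorded dialogue. The representation of the history as a list of performatives and the function name `legal_next_moves_state6` are assumptions made for illustration only.

```python
# Hypothetical sketch: history[t] is the performative uttered at time t,
# so the next move is uttered at time T = len(history).

def legal_next_moves_state6(history):
    """Apply the two enriched rules for state 6.

    Returns the set of legal moves at time T, or None when neither
    rule fires (the rules are silent about that situation).
    """
    T = len(history)
    if T == 0:
        return None
    last = history[T - 1]
    # veto(T') with T' < T: veto occurred at some earlier time point.
    veto_seen = any(move == "veto" for move in history[:T])
    if last in ("request", "justify") and veto_seen:
        return {"accept", "challenge"}
    return None
```

The existential check over the whole history is what the fixed-width n-trigger rules cannot express. The sketch also shows the remaining gap: a `None` result does not tell us that we are in state 1, because the rules contain no way of stating that veto has not occurred.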

Of course, we could further enrich our language and allow for negation and 
explicit quantification in protocol constraints to also capture this kind of situ- 
ation. However, the more we enrich our language the more difficult it will be 
to actually work with protocols expressed in that language. One major problem 
would be to formulate appropriate well-formedness conditions to ensure that 
protocols are non-contradictory and complete. 

Logical State-Based Representation of DFAs. An alternative avenue of research
would be to adopt a state-based representation in logic, encoding straightfor- 
wardly DFAs into two kinds of rules: state maintenance rules representing the 
transition function (that is, given a state and a move, determining the next 
state), of the form 



state(N,T) ∧ P(T) ⇒ state(N',T+1)

and legality rules determining the set of legal continuations from this state of
the dialogue, of the form

state(N,T) ⇒ ⋁_{i∈I} P_i(T+1)
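
The two kinds of rules can be read off a transition table mechanically. The sketch below assumes a small, purely hypothetical DFA fragment (the state numbers and moves are not taken from the paper's figures) and derives the legality rules from the state maintenance rules.

```python
# Hypothetical DFA fragment. Each entry is one state maintenance rule:
# state(N,T) ∧ P(T) ⇒ state(N',T+1) is stored as (N, P) -> N'.
TRANSITIONS = {
    (0, "request"): 1,
    (1, "accept"): 2,
    (1, "refuse"): 3,
}

def legality_rules(transitions):
    """Collect, per state N, the disjunction of moves legal at T+1."""
    legal = {}
    for (state, move) in transitions:
        legal.setdefault(state, set()).add(move)
    return legal

def run(transitions, moves, start=0):
    """Replay a dialogue; return the final state, or None on an illegal move."""
    legal = legality_rules(transitions)
    state = start
    for move in moves:
        if move not in legal.get(state, set()):
            return None
        state = transitions[(state, move)]
    return state
```

For example, `run(TRANSITIONS, ["request", "accept"])` ends in state 2, while a dialogue that opens with `accept` is rejected.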

Conceptually however, if we want to be able to check conformance using the 
techniques described earlier in this paper, all agents should be required to express 
their own communication strategies in terms of the same DFA. This, we believe, 
is quite a strong restriction that we would like to avoid in open agent societies 
(in comparison, the only assumption so far has been that strategy rules refer at 
least to the latest move on the lefthand side of the protocol rules). 

Other Extensions. The protocols we have discussed so far all refer only to the
performatives (such as request) of utterances when evaluating legality conditions.
In some situations, we may also wish to refer to the content of an utterance (for
instance, to prevent an agent from proposing a price lower than the previously
proposed price in the course of an auction, to deal with deadlines, etc.). This
content is expressed in a content language which may be infinite. To permit explicit
reference to this content, protocols should be augmented with data structures
that keep track of this information (or at least of those contents deemed
relevant to constrain the legal follow-ups of the interaction). Examples of such
protocols can be found in argumentation-based approaches to agent interaction
[2], where it is often assumed that agents have access to so-called commitment
stores to record each other’s arguments throughout the dialogue. In general, such
protocols will be more expressive than the protocols described via DFAs.⁵

In [7], a number of abstract models for protocols where content items influ- 
ence the range of legal follow-ups have been studied in terms of abstract machine 
models (such as pushdown automata). It would be interesting to combine these 
ideas with our approach and to design logic-based protocol representation lan- 
guages that allow for the explicit reference to an utterance’s content. 
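
As a rough illustration of a legality condition that refers to content, consider the auction example mentioned above. The sketch assumes that a move is a pair (performative, content) and that a dictionary plays the role of the store recording earlier contents. Both choices, and the names `is_legal` and `utter`, are hypothetical.

```python
def is_legal(move, store):
    """A proposal is legal only if its price is not below the previous one."""
    performative, content = move
    if performative == "propose":
        previous = store.get("last_price")
        return previous is None or content >= previous
    return True  # other performatives are not constrained in this sketch

def utter(move, store):
    """Check legality, then record the content relevant to later moves."""
    if not is_legal(move, store):
        raise ValueError(f"illegal move: {move!r}")
    performative, content = move
    if performative == "propose":
        store["last_price"] = content
    return store
```

Since prices range over an unbounded set, the store cannot in general be folded into finitely many automaton states, which is the sense in which such protocols go beyond DFAs.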



6 Related Work and Discussion 

In this section we briefly discuss two related approaches to logic-based protocols 
for agent communication, namely the “social integrity constraints” of Alberti et 
al. [1] and the set-based protocol description language for the specification of 
logic-based electronic institutions introduced by Vasconcelos [17]. 

Social Integrity Constraints. Alberti et al. [1] put forward a logic-based repre- 
sentation formalism for communication protocols that is closely related to ours. 
Their “social integrity constraints” are similar to our protocol rules; however, 
they explicitly introduce operators to refer to events (such as agents uttering 
particular dialogue moves) that happen in an agent society and those that are 
expected to happen in the future (or, indeed, that should have happened in the 
past). In our system, these notions are implicit: the lefthand side of a protocol 
rule such as request(T) ⇒ accept(T+1) ∨ refuse(T+1) refers to events that have
just happened, while those on the righthand side specify the expectations put 
on the agent whose turn it is next. 

Another difference is that the integrity constraints of Alberti et al. may in- 
clude temporal constraints over the time parameters of the social events (i.e. 
dialogue moves) occurring in a rule. This allows for the representation of a wider 
class of protocols than just shallow protocols. On the downside, this added
expressive power makes the design of well-formed protocols a more difficult task.
Given a set of protocol rules in a rich description language, how do we check
whether this protocol really covers every possible dialogue state?⁶

5 Note however that adding only a set recording contents of a finite language is not
more expressive than a DFA (even if, practically, it can soon turn out to be tedious
to use the automaton-like design).

Similarly, the more expressive a protocol representation language, the more 
difficult will it be to check an agent’s conformance to such a protocol. While in 
the case of automata-based protocols (and certainly shallow protocols) checking 
conformance at runtime is essentially a trivial problem, this becomes a major 
issue for systems where the legality of a given move at a given time cannot be 
decided once and for all at the time it occurs. Indeed, much of [1] is devoted to 
this issue for the case of social integrity constraints. Of course, checking confor- 
mance a priori is considerably more difficult. It firstly requires an abstraction 
from the agent’s specification, expressed in the same language used to express 
protocols (in the case of our system, this abstraction is given by the very simple 
notion of response space). This abstraction then needs to be related to the actual 
protocol in order to define a suitable criterion for guaranteed conformance. 

Logic-Based Electronic Institutions. Vasconcelos [17] puts forward another
logic-based system for specifying protocols in the context of developing electronic
institutions [6]. Electronic institutions are an attempt to provide an “institu- 
tional” framework for the development of open agent societies. This includes, 
in particular, the provision of institutional rules that govern dialogue between 
agents inhabiting such a society, i.e. communication protocols. 

The language proposed in [17] is based on first-order logic, but enriched with 
sets. This allows for the representation of relevant events in the past, which may 
influence the legality of follow-up moves. From a computation-theoretic point of 
view, Vasconcelos’ system is related to the class of “protocols with a blackboard”
described in [7]. As discussed before, protocols of this class extend automata- 
based protocols by allowing for utterances to be stored in a set (the blackboard) 
and for legality conditions that make reference to the elements of that set. 



7 Conclusion 

We have discussed conformance as the basic notion of ensuring that the be- 
haviour of an agent is adapted to a public protocol regulating the interaction in 
a multiagent system. Our approach starts from an alternative representation
formalism for communication protocols based on if-then-rules for the kinds of 
protocols that can be represented as deterministic finite automata. In particular, 
we have restricted ourselves to a class of protocols where it is not necessary to 
consider the history of the dialogue besides the latest move to determine the 
possible legal dialogue continuations (shallowness). We have then discussed the 

6 It should be noted that, in fact, this is only a problematic issue if one requires a 
protocol to provide a complete specification of what constitutes legal behaviour in a 
communicative interaction. While we take the view that a protocol should provide 
such a complete specification in order to facilitate a complete interpretation of the 
dialogues actually taking place, Alberti et al. “tend to constrain agents’ interaction 
as little as possible, only when needed” [1].







Ulle Endriss et al. 



distinction of three levels of conformance: weak conformance which requires that 
an agent never makes an illegal move, exhaustive conformance which in addition 
requires an agent to actually make a (legal) move whenever it is its turn in a
dialogue, and robust conformance which also requires an appropriate reaction 
to illegal incoming moves. In the case of weak and exhaustive conformance, we 
have provided sufficient criteria that can be used to determine whether an agent 
can be guaranteed to behave in conformance to a given protocol. 

Competence. In this paper we have studied the concept of conformance to a 
communication protocol, which is undoubtedly one of the central notions
to be considered when working with protocols. However, the ability to merely
conform to a protocol is not sufficient to be a competent user of that protocol.
Intuitively, we understand competence with respect to a protocol as the capacity 
of an agent to “deal adequately” with that protocol. Take, for instance, our 
negotiation protocol shown in Figure 1. Now assume an agent (supposed to take 
the role of agent B in that protocol) is engaging in a dialogue regulated by this 
protocol using the following response space: 

S* = { request(T) ⇒ refuse(T+1),
       justify(T) ⇒ refuse(T+1) }

Even if this agent was exhaustively conformant to the protocol, it would intu- 
itively not be competent as it could never reach either state 2 or 3 (and con- 
sequently the interaction could never terminate with an accepted request). A 
notion of competence that takes into account such issues is studied in [3].
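
The gap between conformance and competence can be illustrated with a small sketch. The protocol rules below are an assumed reading of the negotiation protocol, restricted to the triggers that agent B has to answer. They are not copied from the figure, and the checks are simple illustrations rather than the sufficient criteria established in the paper.

```python
# Assumed protocol rules for agent B: trigger -> legal responses.
PROTOCOL = {
    "request": {"accept", "refuse", "challenge"},
    "justify": {"accept", "refuse", "challenge"},
}

# The response space S* from the text: B always refuses.
RESPONSE_SPACE = {
    "request": {"refuse"},
    "justify": {"refuse"},
}

def stays_within_protocol(response_space, protocol):
    """Every response the agent may give is allowed by the protocol."""
    return all(
        trigger in protocol and responses <= protocol[trigger]
        for trigger, responses in response_space.items()
    )

def responds_to_every_trigger(response_space, protocol):
    """The agent has at least one response for each protocol trigger."""
    return all(response_space.get(trigger) for trigger in protocol)

def can_ever_utter(response_space, move):
    """Is there any trigger to which the agent might answer with `move`?"""
    return any(move in responses for responses in response_space.values())
```

Under these assumptions the always-refusing agent passes both conformance-style checks, yet `can_ever_utter(RESPONSE_SPACE, "accept")` is false, so no dialogue with this agent can end with an accepted request.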

Acknowledgements 

This research has been funded by the European Union as part of the SOCS 
project (IST-2001-32530). The last author has been partially supported by the 
MIUR (Italian Ministry of Education, University, and Research) programme
“Rientro dei Cervelli”. 



References 

1. M. Alberti, M. Gavanelli, E. Lamma, P. Mello, and P. Torroni. Specification and 
Verification of Agent Interactions using Social Integrity Constraints. In W. van der 
Hoek et al., editors, Proceedings of the Workshop on Logic and Communication in 
Multi-Agent Systems (LCMAS-2003), 2003. 

2. L. Amgoud, N. Maudet, and S. Parsons. Modelling Dialogues using Argumentation.
In Proceedings of the 4th International Conference on MultiAgent Systems
(ICMAS-2000). IEEE, 2000.

3. U. Endriss, W. Lu, N. Maudet, and K. Stathis. Competent Agents and Customising
Protocols. In A. Omicini et al., editors, Proceedings of the 4th International
Workshop on Engineering Societies in the Agents World (ESAW-2003), 2003.




Logic-Based Agent Communication Protocols 






4. U. Endriss, N. Maudet, F. Sadri, and F. Toni. Aspects of Protocol Conformance 
in Inter-agent Dialogue. In J. S. Rosenschein et al., editors, Proceedings of the 2nd 
International Joint Conference on Autonomous Agents and Multiagent Systems 
(AAMAS-2003). ACM Press, 2003. Extended Abstract. 

5. U. Endriss, N. Maudet, F. Sadri, and F. Toni. Protocol Conformance for Logic- 
based Agents. In G. Gottlob and T. Walsh, editors, Proceedings of the 18th Interna- 
tional Joint Conference on Artificial Intelligence (IJCAI-2003). Morgan Kaufmann 
Publishers, 2003. 

6. M. Esteva, J.-A. Rodriguez-Aguilar, C. Sierra, P. Garcia, and J. L. Arcos. On the 
Formal Specification of Electronic Institutions. In F. Dignum and C. Sierra, edi- 
tors, Agent Mediated Electronic Commerce: The European AgentLink Perspective. 
Springer-Verlag, 2001.

7. R. Fernandez and U. Endriss. Abstract Models for Dialogue Protocols. In M. Marx, 
editor, Proceedings of the 5th International Tbilisi Symposium on Language, Logic
and Computation, 2003. 

8. Foundation for Intelligent Physical Agents (FIPA). Communicative Act Library 
Specification, 2002. http://www.fipa.org/specs/fipa00037/. 

9. F. Guerin and J. Pitt. Guaranteeing Properties for E-commerce Systems. In J. Pad- 
get et al., editors, Agent-Mediated Electronic Commerce IV: Designing Mechanisms 
and Systems. Springer-Verlag, 2002.

10. N. Jennings, S. Parsons, P. Noriega, and C. Sierra. On Argumentation-based 
Negotiation. In Proceedings of the International Workshop on Multi-Agent Systems
(IWMAS-1998), 1998. 

11. A. Jones and M. Sergot. On the Characterisation of Law and Computer Sys- 
tems: The Normative Systems Perspective. In Deontic Logic in Computer Science: 
Normative System Specification. John Wiley and Sons, 1993. 

12. H. R. Lewis and C. H. Papadimitriou. Elements of the Theory of Computation. 
Prentice-Hall International, 2nd edition, 1998. 

13. J. Pitt and A. Mamdani. A Protocol-based Semantics for an Agent Communication 
Language. In Proceedings of the 16th International Joint Conference on Artificial 
Intelligence (IJCAI-1999). Morgan Kaufmann Publishers, 1999. 

14. F. Sadri, F. Toni, and P. Torroni. Dialogues for Negotiation: Agent Varieties and 
Dialogue Sequences. In Proceedings of the 8th International Workshop on Agent 
Theories, Architectures and Languages (ATAL-2001). Springer-Verlag, 2001.

15. M. P. Singh. A Customizable Coordination Service for Autonomous Agents. In 
Proceedings of the 4th International Workshop on Agent Theories, Architectures
and Languages (ATAL-1997), 1997. 

16. M. P. Singh. Agent Communication Languages: Rethinking the Principles. IEEE 
Computer, 31(12):40-47, 1998. 

17. W. W. Vasconcelos. Expressive Global Protocols via Logic-based Electronic Insti- 
tutions. In J. S. Rosenschein et al., editors, Proceedings of the 2nd International 
Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2003). 
ACM Press, 2003. Extended Abstract. 

18. P. Yolum and M. P. Singh. Flexible Protocol Specification and Execution: Ap- 
plying Event Calculus Planning Using Commitments. In Proceedings of the 1st 
International Joint Conference on Autonomous Agents and Multiagent Systems 
(AAMAS-2002). ACM Press, 2002. 




Protocol Specification
Using a Commitment Based ACL*,**



Nicoletta Fornara¹ and Marco Colombetti¹,²

¹ Università della Svizzera italiana, via Buffi 13, 6900 Lugano, Switzerland
nicoletta.fornara@lu.unisi.ch

² Politecnico di Milano, Piazza L. Da Vinci 32, I-20133 Milano, Italy
marco.colombetti@lu.unisi.ch



Abstract. We propose a method for the definition of interaction proto- 
cols to be used in open multiagent systems. Starting from the assumption 
that language is the fundamental component of every interaction, we first 
propose a semantics for Agent Communication Languages based on the 
notion of social commitment, and then use it to define the meaning of
a set of basic communicative acts. Second, we propose a verifiable and 
application-independent method for the definition of interaction proto- 
cols, whose main component is the specification of an interaction diagram 
specifying which actions may be performed by agents under given con- 
ditions. Interaction protocols fully rely on the application-independent 
meaning of communicative acts. We also propose a set of soundness con- 
ditions that can be used to verify whether a protocol is reasonable. Fi- 
nally, our approach is exemplified by the definition of an interaction 
protocol for English auctions. 



1 Introduction 

Interaction Protocols are patterns of behavior that agents have to follow to 
engage in a communicative interaction with other agents within a multiagent 
system (MAS). The specification of interaction protocols is crucial for the
development of a MAS: the advent of the Internet makes it urgent to develop general,
application-independent methods for the definition of interaction protocols, to 
be used as components of open, dynamic, heterogeneous, and distributed inter- 
action frameworks for artificial agents. Indeed, the definition of new interaction 
protocols is a critical task, because a badly designed protocol may lead to un- 
successful interactions; thus there is a need for general methods, criteria, and 
tools for protocol design. We think that there are some important properties 
that interaction protocols for open frameworks have to satisfy. In particular, an 
interaction protocol should: 

— Specify legal sequences of communicative acts that form a complete
interaction within a system. Every communicative act used in a protocol should
maintain its meaning, as defined in a general, application-independent com- 
municative act library of a standard Agent Communication Language (ACL). 

— Enable interactions among purely reactive agents, that blindly follow a given 
protocol, and deliberative agents, that are able to reason about the conse- 
quences of actions, and decide whether to take or not to take part in an 
interaction. 

— Allow for effective verification that agents behave in accordance with the
specifications of the interaction protocol.

Moreover, a general method for the development of interaction protocols should 
allow a designer to verify whether a protocol is “sound” with respect to general, 
application-independent soundness criteria. 

So far, several approaches to the definition of interaction protocols have 
been proposed. Some authors define interaction protocols as finite state ma- 
chines or Petri nets (see for example [4] and [10]), but do not take into ac- 
count the meaning of the exchanged messages, which in our opinion is crucial 
to obtain the properties listed above. Other approaches take into account the 
meaning of the exchanged messages, but do not rely on a standard ACL with 
application-independent semantics; for instance Esteva et al. [5] specify the pro- 
tocols available in an electronic institution using finite state machines, but define 
the meaning of only some of the message types using ad-hoc rules. An example 
of interaction protocol specification which fully takes into account the meaning 
of the exchanged messages is proposed by Yolum and Singh [18], who intro- 
duce a method based on event calculus to define protocols that may be used 
by artificial agents to determine flexible paths of interaction complying with the 
specifications. The main difference between Yolum and Singh’s proposal and the 
one put forward in this paper is that with the method described in this paper 
all the preconditions and effects of the performance of communicative acts on 
the state of the interaction are completely specified; we also propose a method 
through which protocol designers may verify if a protocol is sound with respect 
to a number of general, application-independent soundness criteria related also 
to the meaning of the exchanged messages. 

Our approach to agent interaction presupposes the definition of a standard 
ACL with unambiguous semantics. In a previous paper [6] we have shown how 
the semantics of an ACL can be defined in terms of (social) commitments. Our 
definitions set the rules for the execution of communicative acts, which are re- 
garded as commitment-manipulation actions. Starting from such an analysis, 
which will be briefly summarized in Section 2, we show how an interaction pro- 
tocol can be defined. It is important to remark that our protocols are defined 
starting from the communicative act library of a predefined ACL, and that all 
communicative acts preserve their general meaning when used within a proto- 
col. As we shall see, an interaction protocol mainly consists of a set of rules that 
regulate the performance of certain communicative acts; part of these rules are 
expressed in terms of an interaction diagram that specifies which actions can be 
performed by the agents at every stage of the interaction. Of course, an arbitrary
collection of rules does not necessarily define a reasonable interaction protocol.
We therefore propose a set of application-independent and verifiable soundness
conditions, which guarantee that protocols possess certain properties that are
crucial for a successful interaction. Such conditions are expressed in terms of the
content of the system state at each stage of the interaction, as a consequence of
the performance of communicative acts.

The paper is organized as follows. Section 2 introduces a commitment-based 
framework for the definition of an ACL, and a minimal communicative act library 
that we consider essential to describe communicative interactions in an open 
MAS. In Section 3 we define a general method for the definition of interaction 
protocols, and introduce a set of soundness conditions, related to the meaning of 
the messages exchanged by the agents, which may be used to validate interaction 
protocols. In Section 4 we present a specification of a form of English auction, 
an interaction protocol widely used in electronic commerce applications, based 
on the formalism presented in the paper. Finally, in Section 6 we draw some 
conclusions. 

2 A Commitment-Based Agent Communication Language 

A complete operational specification of a commitment-based ACL and a dis- 
cussion of its motivations can be found in [6]. The semantics proposed in that 
paper is given by describing the effects that sending a message has on the social 
relationship between the sender and the receiver of the message using an unam- 
biguous, objective, and public concept, that is, social commitment. We assume 
that the open system in which artificial agents interact consists of the following 
components: 

— A group of registered agents {a, b, ...}.

— A variable set of commitment objects {C_i, C_j, ...}, which are instances of the
commitment class discussed below.

— A variable set of temporal proposition objects {P, Q, ...}, which are instances
of the corresponding class discussed below, and are used to express proposi- 
tions about the application domain and the interaction process. 

— A fixed set of actions that agents may perform, including both communica- 
tive acts belonging to a communicative act library and application domain 
actions. 

— A fixed set of event-driven routines that automatically update the state
of commitment objects on the basis of the truth value of their content and
conditions. These routines are represented by update rules as described in
Table 1.

— A set of domain-specific objects {O_1, O_2, ...}, which represent entities of the
application world. Such entities may possess both “natural” and “institutional”
attributes; for example, the color of a product being sold is a natural
attribute, while the price of the same product is an institutional attribute.
Natural attributes are assumed to reflect the physical properties of the
corresponding entities of the real world, and typically cannot be changed during
an interaction (of course, they might be changed if some of the interacting
agents were assumed to be physical robots). On the contrary, institutional
attributes can be affected by the performance of certain communicative acts,
in particular by declarations (as discussed below). We assume that each
domain-specific object has a value-setting method for each of its institutional
properties; for example the method “setState()” can be invoked to set
the “state” property.

— A fixed set of roles {role_1, role_2, ...}. This concept is introduced to abstract
from the specific agents that take part in an interaction.

— A fixed set of authorizations associated to roles, that specify which agent is
authorized to perform a particular declaration (see Section 2.1 for details).

Table 1. Update Rules.

Commitment objects are used to represent the network of commitments bind- 
ing the interacting agents; they have an internal structure, a life cycle, and a set 
of methods available for manipulation. The internal structure of a commitment 
object consists of the following fields: 

— a unique commitment identifier ( id ); 

— a reference to the commitment’s debtor, that is, the agent that has the com- 
mitment; 

— a reference to the creditor, that is, the agent relative to which the debtor is 
committed; 

— the commitment’s content, that is, the representation of the proposition (de- 
scribing a state of affairs or a course of action) to which the debtor is com- 
mitted relative to the creditor; 

— the commitment’s conditions, that is, a list of propositions that have to be 
satisfied in order for the commitment to become active; 

— a state, taken from the finite set {unset, cancelled, pending, active, fulfilled,
violated}, used to keep track of the dynamic evolution of the commitment; 
and 

— a timeout, which is relevant only in the case of unset commitments, and will 
therefore be treated as an optional parameter. It represents the time limit 
for the debtor of an unset commitment to accept, fulfill or reject it. Once it
has elapsed, the activation of rule 7 sets the commitment to cancelled.

Commitment objects will be represented with the following notation:

C_id(state, debtor, creditor, content|conditions[, timeout]).
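
The internal structure listed above maps naturally onto a record type. The following is a minimal sketch, assuming that agents and propositions are represented by plain strings. The class name `Commitment` and the string rendering are illustrative choices, not part of the original specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional

STATES = {"unset", "cancelled", "pending", "active", "fulfilled", "violated"}

@dataclass
class Commitment:
    id: int
    debtor: str       # the agent that has the commitment
    creditor: str     # the agent relative to which the debtor is committed
    content: str      # proposition the debtor is committed to
    conditions: List[str] = field(default_factory=list)
    state: str = "unset"
    timeout: Optional[float] = None  # only relevant for unset commitments

    def __post_init__(self):
        if self.state not in STATES:
            raise ValueError(f"unknown state: {self.state}")

    def __str__(self):
        # Mirrors the notation: state, debtor, creditor, content|conditions.
        conds = "[" + ", ".join(self.conditions) + "]"
        tail = "" if self.timeout is None else f", {self.timeout}"
        return (f"C_{self.id}({self.state}, {self.debtor}, "
                f"{self.creditor}, {self.content}|{conds}{tail})")
```

For instance, `str(Commitment(1, "a", "b", "P", ["Q"]))` yields `C_1(unset, a, b, P|[Q])`.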

We use temporal proposition objects to represent the content and the conditions 
of a commitment. A temporal proposition object consists of the following fields: 

— a statement in a suitable language which may state that: (i) a certain state of 
affairs holds; (ii) an action has been performed; (iii) a specific commitment 
with certain specific attributes holds; 

— the truth-value of the statement, which may be true (1), false (0) or undefined
(⊥);

— a time interval, which may go from a single instant to the entire life of the
system, relative to which the statement is considered;

— and a temporal mode, either (∀) or (∃), which specifies whether the statement
should be true for the whole time interval or for at least one instant of the
time interval.

We assume that the truth value of temporal proposition objects is updated
by a suitable “notifier”. In particular: if the mode is ‘∃’ the notifier sets the
truth-value to true if the statement becomes true at any point of the time interval,
otherwise sets it to false when the time interval expires; if the mode is ‘∀’ the
notifier sets the truth-value to false if the statement becomes false at any point
of the time interval, otherwise sets it to true when the time interval expires. It is
important to remark that the truth value of a temporal proposition object can
switch from ⊥ to 1 or 0, but then cannot change any more. In particular cases,
as we shall see, it is possible to infer in advance that the statement of a temporal 
proposition object can no longer become true (false) within the associated time 
interval. In this case the notifier may set the truth value to false (true) before 
the time interval expires. To do so, the notifier may exploit specific inference 
rules (more on this later). 

Temporal proposition objects are represented with the following notation: 

P_identifier(statement, time interval, mode, truth-value).
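
The behaviour of the notifier can be sketched as follows. The sketch assumes discrete time, an interval given by integer bounds `start` and `end`, and `None` standing for the undefined value ⊥. The class `TemporalProposition`, the function `notify` and the mode names `"exists"` and `"forall"` are hypothetical, and the early inference mentioned above is not modelled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TemporalProposition:
    statement: str
    start: int
    end: int
    mode: str                     # "exists" (∃) or "forall" (∀)
    truth: Optional[bool] = None  # None stands for undefined (⊥)

def notify(prop, time, holds):
    """Update `prop` given whether its statement holds at instant `time`."""
    if prop.truth is not None:
        return prop.truth         # once defined, the value never changes
    if prop.start <= time <= prop.end:
        if prop.mode == "exists" and holds:
            prop.truth = True     # one witness is enough for ∃
        elif prop.mode == "forall" and not holds:
            prop.truth = False    # one counterexample refutes ∀
    if prop.truth is None and time >= prop.end:
        # Interval expired without witness (∃) or counterexample (∀).
        prop.truth = (prop.mode == "forall")
    return prop.truth
```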

As we have already said, temporal proposition objects are used to represent 
content and conditions of a commitment. In particular the conditions of a com- 
mitment consist of a list [P, Q, ...] of temporal proposition objects that have to 
be satisfied in order for the commitment to become active. The truth value of a 
list of temporal proposition objects is computed as follows: (i) an empty list of 
temporal proposition objects is true; (ii) a true temporal proposition object is 
removed from the list; (iii) a list containing a false proposition object is false. 
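
Rules (i) to (iii) amount to a three-valued conjunction. A minimal sketch, assuming that truth values are encoded as `True`, `False` and `None` (undefined), and that a list still containing only undefined items is itself undefined (an assumption, since the text does not spell this case out):

```python
def list_truth(values):
    """Truth value of a list of temporal proposition truth values."""
    remaining = [v for v in values if v is not True]  # rule (ii)
    if any(v is False for v in remaining):            # rule (iii)
        return False
    if not remaining:                                 # rule (i)
        return True
    return None  # only undefined items are left
```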

To make the notation simpler, when the list of conditions contains one tem- 
poral proposition object the square brackets are dropped. We also remark that a 
temporal proposition object, used to express the content or a condition of a com- 
mitment object, may in turn represent another commitment object. In particular 
temporal proposition objects can be used to represent conditions on the temporal 
evolution of commitments. An example of this is given in Subsection 2.1. 

Fig. 1. The life-cycle of commitments.



The life cycle of a commitment object is described by the finite state machine 
in Figure 1. The state of a commitment can change as an effect of the invoca- 
tion of its basic methods (solid lines) or of environmental events (represented 
by dotted lines labelled with the name of the related update rule described in 
Table 1), that is, of events that change the truth-value of a commitment’s
conditions or content. We assume that when a commitment object is declared, the
constructor of the class creates an empty commitment object, C_i(). We represent
the invocation of a method by the name of the object followed by a dot and by 
the name of the method with its parameter list. Commitments are created and 
manipulated through the following basic operations: 

— Make commitment. By invoking the method mc(a, b, P, Q) with arbitrary
debtor a, creditor b, content P, and condition list Q, a new unset commitment
object is created:

C_i().mc(a, b, P, Q[, to]) → C_i(unset, a, b, P|Q[, to])

— Set commitment. The method sc(s) changes the current state of an existing
commitment object to s:

C_i(−, a, b, P|Q).sc(s) → C_i(s, a, b, P|Q)

— Add condition. The method ac(R) adds a new temporal proposition object
R to the conditions of the commitment:

C_i(s, a, b, P|Q).ac(R) → C_i(s, a, b, P|R•Q),

where the symbol • denotes the operation of inserting a new element in a
list.
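
The three basic operations can be sketched as methods of a small class. The method names follow the text, while the class itself, the use of strings for agents and propositions, and the choice to return the object for chaining are assumptions of this illustration.

```python
class Commitment:
    def __init__(self):
        # Empty commitment object, written C_i() in the text.
        self.state = None
        self.debtor = self.creditor = self.content = None
        self.conditions = []
        self.timeout = None

    def mc(self, debtor, creditor, content, conditions, timeout=None):
        """Make commitment: fill in the fields and set the state to unset."""
        self.debtor, self.creditor = debtor, creditor
        self.content = content
        self.conditions = list(conditions)
        self.timeout = timeout
        self.state = "unset"
        return self

    def sc(self, state):
        """Set commitment: change the current state."""
        self.state = state
        return self

    def ac(self, condition):
        """Add condition: insert R in front of the condition list (R•Q)."""
        self.conditions.insert(0, condition)
        return self
```

A call such as `Commitment().mc("a", "b", "P", ["Q"]).ac("R")` leaves an unset commitment whose condition list is `["R", "Q"]`.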

Basic operations should not be viewed as actions that are directly performed by 
agents. Rather, they are low-level primitives used to implement operations on 
commitment objects; more specifically, agents manipulate commitments through
a communicative act library. 

In this paper we do not tackle the crucial problem of completely formalizing 
the electronic institution [5] where interactions among communicative agents 
take place. However, regarding the complete definition of the ontology of the
commitment, we think that it is possible to list a reasonable set of basic
authorizations that have to be taken into account when new communicative acts are
defined using operations on commitment objects. Such basic authorizations are:

— any agent can create an unset commitment with arbitrary debtor and cred- 
itor; 

— the debtor of an unset commitment can set it to either pending or cancelled;

— the creditor of an unset, pending, or active commitment can set it to can- 
celled. 

These basic authorizations may be modified or new ones may be introduced on 
the basis of the particular electronic institution where the interaction actually 
takes place. 
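
The basic authorizations listed above can be read as a simple permission check. The sketch below assumes a commitment represented as a dictionary with `state`, `debtor` and `creditor` keys. The function names are hypothetical, and a concrete electronic institution may of course add or change rules.

```python
def may_create_unset(agent):
    """Any agent can create an unset commitment."""
    return True

def may_set(agent, commitment, new_state):
    """Check the two state-changing authorizations from the list."""
    state = commitment["state"]
    # The debtor of an unset commitment can set it to pending or cancelled.
    if (agent == commitment["debtor"] and state == "unset"
            and new_state in ("pending", "cancelled")):
        return True
    # The creditor of an unset, pending or active commitment can cancel it.
    if (agent == commitment["creditor"]
            and state in ("unset", "pending", "active")
            and new_state == "cancelled"):
        return True
    return False
```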

Finally note that we defined the conditions under which commitments are 
fulfilled or violated, but we are not concerned with the management of violations 
e.g. in terms of sanctions, because this aspect lies beyond the use of commitments 
for the definition of ACL semantics. 

2.1 Library of Communicative Acts 

We shall now define the meaning of the basic types of communicative acts as
identified by Speech Act Theory [14]. We extend the definitions of [6] by introducing
the definition of a new commissive act, the conditional accept act, and
a treatment of declarations; both will be used in the example of Section 4. In
the following definitions the sign “=def” means that performing the action represented
on the left-hand side is the same as performing the action represented
on the right-hand side, and the symbol “:=” means that the act represented on
the left-hand side is actually performed through the invocation of the methods
listed on the right-hand side.



Assertives. According to Speech Act Theory, the point of an assertive act is to 
commit the sender, relative to the receiver, to the truth of what is asserted. We 
consider the inform act as our prototypical assertive act. This act is used by 
agent a to inform agent b that P is the case. In a commitment-based approach, 
an act of informing can be defined as follows (TRUE is the identically true
temporal proposition object):

$\mathit{inform}(a, b, P) := \{C_i().\mathit{mc}(a, b, P, \mathit{TRUE});\ C_i(\mathit{unset}, a, b, P \mid \mathit{TRUE}).\mathit{sc}(\mathit{pending})\}$.

The final result is an active commitment, thanks to the intervention of Update 
Rule 3. 
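
The effect of an inform act can be traced with a short Python sketch. The update rules are not restated in this section, so the sketch assumes a simplified reading of Update Rule 3: a pending commitment whose conditions are all the constant TRUE becomes active. The class layout and the string TRUE are illustrative assumptions.

```python
TRUE = "TRUE"  # stands in for the identically true temporal proposition object


class Commitment:
    def __init__(self, debtor, creditor, content, conditions):
        self.state = "unset"
        self.debtor, self.creditor = debtor, creditor
        self.content = content
        self.conditions = list(conditions)

    def sc(self, state):
        self.state = state
        self._apply_update_rule()

    def _apply_update_rule(self):
        # Assumed reading of Update Rule 3: a pending commitment whose
        # conditions all hold is turned into an active commitment.
        if self.state == "pending" and all(q == TRUE for q in self.conditions):
            self.state = "active"


def inform(a, b, p):
    """inform(a, b, P): a commits, relative to b, to the truth of P."""
    c = Commitment(a, b, p, [TRUE])   # C_i().mc(a, b, P, TRUE)
    c.sc("pending")                   # C_i(unset, a, b, P|TRUE).sc(pending)
    return c
```

Under this assumption, inform("a", "b", "P") returns a commitment that is already active, in line with the remark above.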



Directives. As defined in Speech Act Theory, the point of a directive act is to
get the receiver to perform an action (possibly a speech act). We treat request
as our basic directive act, and define it as the creation of an unset commitment
with the sender as the creditor and the receiver as the debtor. The request by
agent a to agent b to bring about P if condition list Q is satisfied is defined as:

$\mathit{request}(a, b, P, Q[, \mathit{to}]) := \{C_i().\mathit{mc}(b, a, P, Q[, \mathit{to}])\}$.

The receiver of a request can react in three different ways: it can perform the 
requested action, accept the request, or refuse it. Questions (or queries) are 
requests to be informed about something. Here we deal only with wh-questions;
for a definition of yes-no questions see [6]. In wh-questions the requested act of
informing cannot be completely described by the sender (otherwise, why should
it ask the question?). In these cases the sender provides a “template” for the
answer, that is, a temporal proposition object $S(x)$ containing a meta-variable
$x$ that the receiver has to replace with a constant value $c$. A query has therefore
the form:

$\mathit{request}(a, b, P, \mathit{TRUE})$
where $P.\mathit{statement}() = \mathit{inform}(b, a, S(x))$
and $\mathit{inform}(b, a, S(x)) =_{def} \mathit{inform}(b, a, S(c))$
for some constant value $c$.

This definition implies that the performance of the requested inform act with 
the temporal proposition S(c) as a parameter makes the temporal proposition 
P true. Indeed, as remarked by Searle [15] the concept of a question is more 
general: by a question, an agent may request the execution of a non-assertive 
communicative act (like a directive, or a commissive). However, our definition 
above easily generalizes to such cases (an example can be found in Section 4). 



Commissives. The point of a commissive act, as defined by Speech Act theory, 
is to commit the debtor, relative to the creditor, to the execution of an action 
of a given type. Here we define the basic commissive act of promising: 

$\mathit{promise}(a, b, P, Q) := \{C_i().\mathit{mc}(a, b, P, Q);\ C_i(\mathit{unset}, a, b, P \mid Q).\mathit{sc}(\mathit{pending})\}$.

To make an unconditional promise the constant proposition object TRUE is used 
as the condition, and thus the pending commitment created by the promise is 
immediately turned into an active commitment by Update Rule 3. Three types of 
commissive acts can be performed only in connection with an unset commitment, 
namely accept, conditional accept and reject. Accepting and rejecting are 
defined as follows: 

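preconditions: $\exists\, C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}])$
$\mathit{accept}(b, a, C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}])) := \{C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}]).\mathit{sc}(\mathit{pending})\}$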



preconditions: $\exists\, C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}])$
$\mathit{reject}(b, a, C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}])) := \{C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}]).\mathit{sc}(\mathit{cancelled})\}$

Another useful commissive act is “conditional accept”, which may be used by
agents to negotiate the condition of an unset commitment. In particular, conditional
acceptance will appear in the example proposed in Section 4. In fact,
in the English Auction Protocol at every round of the bidding process the auctioneer
accepts the currently highest bid on condition that no higher bids will
be accepted later. In general, the debtor of an unset conditional commitment $C_i$
can accept it provided that an additional condition, represented by a temporal
proposition object, holds. Conditional acceptance transforms an unset commitment
into a pending commitment, and adds a new condition to the original
condition list of the unset commitment:

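preconditions: $\exists\, C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}])$
$\mathit{condAccept}(b, a, C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}]), R) :=$
$\{C_i(\mathit{unset}, b, a, P \mid Q[, \mathit{to}]).\mathit{ac}(R);\ C_i(\mathit{unset}, b, a, P \mid R \bullet Q[, \mathit{to}]).\mathit{sc}(\mathit{pending})\}$

Note that when condition R becomes true, the debtor is left with a pending
conditional commitment of the form $C_i(\mathit{pending}, b, a, P \mid Q)$.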
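
The request act and the three reactions to an unset commitment (accept, reject, conditional accept) can be summarized in a short Python sketch. It is a simplified illustration: propositions are plain strings, the shared precondition is checked by a hypothetical helper _require_unset, and the function signatures take the commitment object directly instead of repeating all its fields.

```python
class Commitment:
    def __init__(self, debtor, creditor, content, conditions, timeout=None):
        self.state = "unset"
        self.debtor, self.creditor = debtor, creditor
        self.content = content
        self.conditions = list(conditions)
        self.timeout = timeout


def request(a, b, p, q, to=None):
    """request(a, b, P, Q[, to]) := {C_i().mc(b, a, P, Q[, to])}."""
    return Commitment(debtor=b, creditor=a, content=p, conditions=q, timeout=to)


def _require_unset(agent, c):
    # Shared precondition: an unset commitment whose debtor is the acting agent.
    if c.state != "unset" or c.debtor != agent:
        raise ValueError("precondition failed: no unset commitment with this debtor")


def accept(b, c):
    _require_unset(b, c)
    c.state = "pending"            # sc(pending)


def reject(b, c):
    _require_unset(b, c)
    c.state = "cancelled"          # sc(cancelled)


def cond_accept(b, c, r):
    _require_unset(b, c)
    c.conditions.insert(0, r)      # ac(R): P|Q becomes P|R . Q
    c.state = "pending"            # sc(pending)
```

For example, after c = request("a", "b", "P", ["Q"]) the call cond_accept("b", c, "R") leaves a pending commitment with condition list ["R", "Q"].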



Proposals. A proposal is a combination of a directive and a commissive act. 
Even if proposals are not basic acts, they deserve special attention because they 
are crucial in many interesting application fields, like for example electronic 
commerce. A propose act can be defined as the parallel execution of a request 
and a promise, as denoted by the symbol ||: 

$\mathit{propose}(a, b, P, Q[, \mathit{to}]) =_{def} \mathit{request}(a, b, P, Q[, \mathit{to}]) \parallel \mathit{promise}(a, b, Q, S)$
where $S.\mathit{statement}() = C_i(\mathit{pending}, b, a, P \mid Q)$

Note that in the above definition the statement of temporal object $S$ represents
the commitment object $C_i(\mathit{pending}, b, a, P \mid Q)$.
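
To make the structure of a proposal concrete, the following Python sketch builds the two commitments that result from a propose act. It is an illustration under stated assumptions: commitments are immutable records, the temporal proposition S is represented directly by the commitment it describes, and the parallel composition is rendered as two consecutive constructions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Commitment:
    state: str
    debtor: str
    creditor: str
    content: object
    conditions: Tuple[object, ...]


def propose(a, b, p, q):
    """Return the two commitments created by propose(a, b, P, Q)."""
    # request(a, b, P, Q): an unset commitment of b towards a.
    requested = Commitment("unset", b, a, p, (q,))
    # S states that b's commitment to P under condition Q is pending.
    s = Commitment("pending", b, a, p, (q,))
    # promise(a, b, Q, S): a commits to Q on condition that S holds.
    promised = Commitment("pending", a, b, q, (s,))
    return requested, promised
```

With propose("buyer", "seller", "deliver", "pay"), the seller receives an unset commitment to deliver, and the buyer is committed to pay as soon as the seller's commitment becomes pending.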



Declarations. Declarations are a special type of communicative act. Examples
of declarations are “I pronounce you man and wife” or “I declare the auction 
open”. The point of a declaration is to bring about a change in the world, ob- 
viously not in the physical or natural world but in an institutional world [16, 2], 
that is, a conventional world relying on common agreement of the interacting 
agents (or, more precisely, of their designers). Declarations actually change the 
institutional world simply in virtue of their successful performance. In our inter- 
action framework, to treat declarations we introduce objects with institutional 
properties, that is, conventional properties that result from common agreement, 
like for example the ownership of a product. Such properties can be affected by 
declaration acts. It is however necessary to identify which agents are authorized 
or empowered to perform a given declaration act in the system. Typically, autho- 
rizations are granted to agents in virtue of the role they play in an interaction, 
and thus authorizations are naturally associated to roles. To do so, we need to
introduce a construct to express that an agent having a given role in the interaction
system is empowered to bring about an institutional change of a given
kind:



preconditions: $\mathit{empowered}(\mathit{role}_i, o_k.\mathit{setProp}_j()) \wedge a.\mathit{role}() = \mathit{role}_i$
$\mathit{declare}(a, o_k.\mathit{prop}_j = x) := \{o_k.\mathit{setProp}_j(x)\}$.
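
A declaration can be sketched in Python as a guarded call to a setter of an institutional object. The Auction class, the EMPOWERED table and the use of the role name as first argument are assumptions chosen for illustration; the text only requires that the declaring agent plays a role that is empowered to invoke the setter.

```python
class Auction:
    """Hypothetical institutional object with one institutional property."""

    def __init__(self):
        self.state = "closed"

    def set_state(self, value):
        self.state = value


# (role, object type, setter name) triples; the contents are illustrative.
EMPOWERED = {("Auctioneer", "Auction", "set_state")}


def declare(agent_role, obj, setter, value):
    """declare(a, o.prop = x): call o.setProp(x) if a's role is empowered."""
    if (agent_role, type(obj).__name__, setter) not in EMPOWERED:
        raise PermissionError(f"role {agent_role!r} is not empowered to call {setter}")
    getattr(obj, setter)(value)
```

An agent in the Auctioneer role can thus open the auction with declare("Auctioneer", auction, "set_state", "open"), while the same call made in the Client role raises PermissionError and leaves the object unchanged.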



3 Interaction Protocols 

Having defined an essential Communicative Act Library we can now proceed to 
the specification of interaction protocols. An interaction protocol is defined by an 
environment and an interaction diagram. In particular, a protocol’s environment 
defines: 

— A nonempty set of roles that agents can play in the interaction. To each role, 
a set of specific authorizations may be associated. 

— A nonempty set of participants, which are the agents interacting by using the 
protocol. Every participant must play a well-defined role in the interaction. 
The set of participants may vary during the execution of the protocol, but 
is always finite. 

— A possibly empty set of global constants and variables, that may be subject 
to global constraints. 

— A collection of commitment objects, temporal proposition objects, and 
domain-specific entities used to represent all concepts involved in the in- 
teraction. 

— A set of authorizations, associated to roles, to perform certain institutional 
actions, in particular declarations. 

A protocol’s interaction diagram specifies which actions may be performed 
by each agent at each stage of the interaction. More precisely, an interaction 
diagram (see for example Figure 2) is defined by a finite graph in which: 

— Every node represents a state of the interaction. To every state we can asso- 
ciate a representational content, that is, the set of all facts that hold at the 
state, expressed in terms of: protocol variable values, commitment objects, 
temporal proposition objects, and domain-specific objects. 

— There is a single distinguished initial node, with no incoming edge, and a set
of distinguished final nodes, with no outgoing edge. The interaction starts
from the initial node and ends when a final node is reached.

— Every edge describes a transition from a state to another state. A transition 
may correspond to the execution of a communicative act or to the occurrence 
of a relevant environmental event; when the transition occurs, the content of 
the target state can be completely computed from the content of the source 
state, and from the semantics of the communicative act or a description of 
the environmental event responsible for the transition.

— When more than one communicative-act edge goes out of a given node, it is
possible to specify the conditions (defined as arbitrary Boolean expressions)
under which each act may be executed. As a whole, the set of condition-action
pairs going out of a node behaves like a guarded command [3]: at
least one of the actions must be executed, but the agent specified as the actor
of the action is free to choose which action to perform among those whose
guard is true. If all guards are mutually exclusive, the guarded command is
equivalent to a sequence of if-then statements.

— It is possible to associate a cardinality to communicative act edges. In particular,
cardinality “1 to n” means that the same message is sent by one agent
to n agents, and cardinality “1 to 1” means that a message is sent by one
agent to another agent.

Fig. 2. Interaction diagram of the English Auction Protocol.
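
The guarded-command behaviour of the edges leaving a node can be illustrated with a few lines of Python. This is a sketch under assumptions: the content of a state is a dictionary, each outgoing edge is a (guard, actor, action) triple, and the free choice of the actor is modelled by a random pick among the enabled actions.

```python
import random


def enabled_actions(edges, content):
    """Return the (actor, action) pairs whose guard holds in `content`."""
    return [(actor, action) for guard, actor, action in edges if guard(content)]


def choose(edges, content, rng=random):
    """Pick one enabled action; the actor is free to choose among them."""
    options = enabled_actions(edges, content)
    if not options:
        raise RuntimeError("no guard is true: the interaction cannot proceed")
    return rng.choice(options)


# Illustrative edges: the auctioneer's two mutually exclusive replies to a bid.
EDGES = [
    (lambda s: s["v"] > s["ask"], "auctioneer", "condAccept"),
    (lambda s: s["v"] <= s["ask"], "auctioneer", "reject"),
]
```

Because the two guards in EDGES are mutually exclusive, exactly one action is enabled in every state, which is the if-then case mentioned in the text.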

3.1 Soundness Conditions 

To describe a sensible interaction pattern, an interaction protocol must satisfy a 
number of general, application-independent soundness conditions. A first, fairly 
trivial, set of conditions concerns the topology of the interaction diagram: 

— Every node of the interaction diagram must be reachable from the initial 
node. 

— There must be at least one final node.

Another, less trivial, set of soundness conditions concerns the content of states.
Such conditions express constraints related to the meaning of the exchanged
messages, as defined by the communicative act library adopted.

— All communicative acts that are allowed by a protocol at state s must have 
their preconditions satisfied by the content associated to s when their guard 
is true. This condition guarantees that all communicative acts allowed by 
the interaction protocol may actually be executed. 

— All commitments included in the content of a final state must be cancelled, 
fulfilled, or violated. This condition guarantees that the whole interaction 
has been completed. 
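
Some of the soundness conditions listed so far lend themselves to a mechanical check. The Python sketch below covers the two topological conditions and the condition on final states; the check on preconditions is omitted because it depends on the act library. The graph encoding (a list of (source, target) pairs) and the function names are assumptions made for illustration.

```python
from collections import deque

RESOLVED = {"cancelled", "fulfilled", "violated"}


def reachable(edges, initial):
    """Breadth-first search over (source, target) pairs."""
    seen, queue = {initial}, deque([initial])
    while queue:
        node = queue.popleft()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


def check_topology(nodes, edges, initial, finals):
    """Return a list of violated topological conditions (empty if sound)."""
    problems = []
    if not finals:
        problems.append("no final node")
    unreachable = set(nodes) - reachable(edges, initial)
    if unreachable:
        problems.append(f"unreachable nodes: {sorted(unreachable)}")
    return problems


def check_final_content(final_contents):
    """final_contents maps each final node to the states of its commitments.

    Returns the final nodes that still contain an unresolved commitment.
    """
    return [node for node, states in final_contents.items()
            if any(s not in RESOLVED for s in states)]
```

A diagram passes this partial check when check_topology returns an empty list and check_final_content reports no final node.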

An interesting problem is raised by the fact that during the execution of a pro- 
tocol, the same state may be reached from the start state following different 
paths (i.e., performing different chains of actions). For example, a certain state 
of an interaction could be reached because an agent has autonomously made a 
promise or because the agent was requested to make a promise, accepted the 
request, and then fulfilled the resulting commitment by actually making the 
promise. If we abstract from the different paths, we intuitively feel that the in- 
teraction has reached the same state; however, if we compute the content of the 
state we get different results. The point is that these results, although different, 
are equivalent from the point of view of the interaction, in that they have the 
same “commissive import”. More precisely, we say that state s is equivalent to
state s' if and only if the contents of s and s' are identical, with the only exception
of commitments that are fulfilled, violated, or cancelled. We can therefore
formulate another soundness condition:

— If a state of an interaction can be reached through different paths, the contents
of the state computed along the different paths must be equivalent.
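
The notion of equivalence between state contents can be written down directly. In the Python sketch below, a state content is a set of tuples and commitments are encoded as ("commitment", state, debtor, creditor, content, condition); this encoding and the function names are illustrative assumptions.

```python
RESOLVED = {"fulfilled", "violated", "cancelled"}


def commissive_import(content):
    """Drop commitments that are fulfilled, violated, or cancelled."""
    return frozenset(f for f in content
                     if not (f[0] == "commitment" and f[1] in RESOLVED))


def equivalent(content_a, content_b):
    """Two contents are equivalent if they differ only in resolved commitments."""
    return commissive_import(content_a) == commissive_import(content_b)


# Path 1: agent a autonomously promises P to b.
direct = {("commitment", "pending", "a", "b", "P", "Q")}
# Path 2: a was asked to promise, accepted, and fulfilled that commitment
# by making the promise; the fulfilled commitment remains in the content.
via_request = direct | {("commitment", "fulfilled", "a", "b", "promise", "TRUE")}
```

Here equivalent(direct, via_request) holds, which mirrors the promise example discussed in the text.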

The situation is even more complex when the definition of an interaction protocol 
has a loop, that is, a cycle in the interaction diagram. Interaction loops naturally 
appear when a sequence of communication acts can be repeated several times, 
like for example in the English Auction Protocol (Section 4). The existence of 
loops makes it possible to reach the same state following different paths in the 
interaction diagram. In this case, however, the notion of equivalence discussed 
above is still necessary but no longer sufficient. This problem is well known in 
computer programming, and it can be solved by introducing the concept of a loop 
invariant. For example, consider again the protocol for an English Auction. At 
a generic iteration, the auctioneer is committed to selling the product currently 
under the hammer to a specific agent for a specific price, on condition that no 
higher price will be offered. Of course, the specific agent that made the highest 
offer, as well as the associated price, will change from one iteration to another 
one. However, we can describe the situation in terms of loop invariants, by saying 
that the auctioneer is committed to selling the product to the agent that made 
the highest offer, for the price defined by such an offer, on condition that no 
higher offer will be made. The soundness condition given above can now be 
reformulated as follows: 

— If a state of an interaction can be reached through different paths, the con- 
tents of the state computed along the different paths, expressed in terms of 
suitable invariants, must be equivalent. 



4 The English Auction Protocol 

In this section we present a specification of a form of English auction protocol
using the framework proposed so far. We chose this protocol as an example
because it is used in many electronic commerce applications on the web, and
because it is fairly complex: in particular, it is an interesting example of an
iterative interaction protocol. In this example we consider the interaction process
needed to sell a single product $o$, which can obviously be repeated to sell several
products.

4.1 The Environment 

The environment of the English Auction Protocol includes the following ele- 
ments: 

— Roles. Auctioneer and Client. 

— Participants. One agent, $a$, in the role of Auctioneer and $n$ agents,
$\{p_1, \ldots, p_n\}$, in the role of Client.

— Constants and Constraints. $t_{max}$, the maximum duration of the auction;
$t_1$, the deadline for the payment; $t_2$, the deadline for the delivery.
$t_{max} < t_1 < t_2$.

— Domain-specific entities.

Object $o$, representing the product on sale, with a resPrice field for the
reservation price.

Object $A$, representing the auction, with fields for the following variables
(initialized as indicated): state = “closed”; askPrice = 0; $t_{end}$, automatically
set to $t_{system}$ when state is set to “closed”; $t_{inactivity}$, i.e. the maximum time
of inactivity between two subsequent bids.

The action of transferring the ownership of an object or of a sum of money 
to another agent is an institutional action involving the institutional notions 
of ownership and money. For the sake of simplicity, we treat here this action 
as a primitive domain action. The fact that agent a transfers to agent b the 
ownership of x (an object or a sum of money) is represented by give(a , b , x). 

— Variables. The environment has the following variables (initialized as indicated):
$\mathit{newAskPrice} = o.\mathit{resPrice}()$; $\mathit{value}_{win} = 0$; $t_{system}$, a global
clock accessible to all participants; $t_{bid} = 0$, the time of the last accepted bid;
$i$, a counter that is automatically incremented every time the bidding process is
iterated.

— Authorizations. The auctioneer is empowered to open and close the auc- 
tion and to set the ask price of the product on sale: 

$\mathit{empowered}(\mathit{Auctioneer}, A.\mathit{setState}())$,
$\mathit{empowered}(\mathit{Auctioneer}, A.\mathit{setAskPrice}())$.



Scheme of Temporal Proposition Objects. In the interaction framework
proposed so far, the content language used to express the content and the condition
fields of commitment objects is based on temporal proposition
objects. Given that complex interactions like the ones that follow the English
Auction Protocol involve many temporal proposition objects, we concisely
describe them through schemes, which represent possible temporal proposition
objects in parametric form. Parameters are bound to specific values when
the interaction actually takes place and an instance of the temporal proposition
object is created (truth values are always initialized to $\bot$, and therefore are not
indicated in schemes). In our example, parameter $\mathit{now}$ is initialized to $t_{system}$
when the temporal proposition object is created, and parameters $v$ and $v'$ are
used to indicate an amount of money.

— Scheme $P_j$ represents the proposition “the auctioneer gives product $o$ to
client $p_j$, in the time interval from the end of the auction to $t_2$”:
$P_j(\mathit{give}(a, p_j, o),\ t_{end} \ldots t_2,\ \exists)$;

— Scheme $Q_{j,v}$ represents the proposition “client $p_j$ gives the amount $v$ of
money to the auction house represented by the auctioneer, in the time interval
from the end of the auction to $t_1$”:
$Q_{j,v}(\mathit{give}(p_j, a, v),\ t_{end} \ldots t_1,\ \exists)$;

— Scheme $S_{j,i}$ represents the proposition “client $p_j$ makes a proposal during
iteration $i$”:
$S_{j,i}(\mathit{propose}(p_j, a, P_j, Q_j(x)),\ \mathit{now} \ldots \mathit{now} + t_{inactivity},\ \exists)$;

— Scheme $U_{j,v}$ represents the proposition “the auctioneer is committed, relative
to client $p_j$, to proposition $P_j$ under condition $Q_{j,v}$, in the time interval from
now to the end of the auction”:
$U_{j,v}(C_j(\mathit{pending}, a, p_j, P_j \mid Q_{j,v}),\ \mathit{now} \ldots t_{end},\ \exists)$;

— Scheme $W_{v'}$ represents the proposition “the auctioneer does not accept any
proposal with value greater than $v'$ in the time interval from now to the end
of the auction”:
$W_{v'}(\neg \exists j\, (\mathit{condAccept}(a, p_j, C_{id2_j}(\mathit{unset}, a, p_j, P_j \mid Q_{j,v}), W_v) \wedge v > v'),\ \mathit{now} \ldots t_{end},\ \exists)$.



Communicative Acts. In this section we give the specific conditions (guards) for
the performance of the communicative acts used in the English Auction Protocol.
Obviously, for a communicative act to be successfully performed,
the preconditions defined in the Library of Communicative Acts also have to be
satisfied. Some of these acts have to be repeated at every round $i$ of the bidding
process. In order to be able to refer to commitment objects created by previously
performed communicative acts, we also report the effects of the performance of
communicative acts; they can also be computed from the definitions given in
the library of communicative acts. Therefore, in the protocol communicative acts
retain the semantics defined in the library of communicative acts; this contrasts
with the approaches in which the semantics of communicative acts is affected
by the protocol [12]. Moreover, the performance of certain communicative acts
changes the value of some environmental variables.

— The auctioneer declares the auction open (state $s_0$).

guards: $A.\mathit{state}() = \text{``closed''}$
$\mathit{declare}(a, A.\mathit{state} = \text{``open''})$

— The auctioneer declares the current ask-price of the ongoing auction (states
$s_1$, $s_6$, $s_{11}$).

guards: $A.\mathit{askPrice}() < \mathit{newAskPrice}$
$\mathit{declare}(a, A.\mathit{askPrice} = \mathit{newAskPrice})$

— The auctioneer makes the “call for proposals” (states $s_2$, $s_5$, $s_7$, $s_{10}$).

$\mathit{request}(a, p_j, S_{j,i}, \mathit{TRUE}, t_{inactivity})$
effects: $C_{id1_j}(\mathit{unset}, p_j, a, S_{j,i} \mid \mathit{TRUE}, t_{inactivity})$

— One participant makes its proposal (states $s_3$, $s_8$).

guards: $\{(t_{system} < t_{max}),\ (t_{system} - t_{bid} < t_{inactivity})\}$
$\mathit{propose}(p_j, a, P_j, Q_{j,v})$
effects: $\{C_{id2_j}(\mathit{unset}, a, p_j, P_j \mid Q_{j,v}[, \mathit{to}]),\ C_{id3_j}(\mathit{pending}, p_j, a, Q_{j,v} \mid U_{j,v}),\ S_{j,i}.\mathit{truth\_value}() = 1\}$

— If the value of the proposal is greater than the current ask-price the auctioneer
has to accept it (states $s_4$, $s_9$).

guards: $v > A.\mathit{askPrice}()$
$\mathit{condAccept}(a, p_j, C_{id2_j}(\mathit{unset}, a, p_j, P_j \mid Q_{j,v}), W_v)$
effects: $\{C_{id2_j}(\mathit{pending}, a, p_j, P_j \mid [Q_{j,v}, W_v]),\ \forall v' < v:\ W_{v'}.\mathit{truth\_value}() = 0\}$
variable updates: $\{\mathit{newAskPrice} = v,\ t_{bid} = t_{system}\}$

— If the value of the proposal is less than or equal to the current ask-price the
auctioneer has to reject it (states $s_4$, $s_9$).

guards: $v \leq A.\mathit{askPrice}()$
$\mathit{reject}(a, p_j, C_{id2_j}(\mathit{unset}, a, p_j, P_j \mid Q_{j,v}))$
effects: $\{C_{id2_j}(\mathit{cancelled}, a, p_j, P_j \mid Q_{j,v}),\ U_{j,v}.\mathit{truth\_value}() = 0\}$

— The auctioneer can declare the auction closed only if the time of inactivity is
equal to the constant value defined at the beginning of the auction or when
the fixed end time of the auction is reached (states $s_3$, $s_8$).

guards: $\{(t_{system} \geq t_{max}) \vee (t_{system} - t_{bid} \geq t_{inactivity}),\ A.\mathit{state}() = \text{``open''}\}$
$\mathit{declare}(a, A.\mathit{state} = \text{``closed''})$
effects: $\{W_{value_{win}}.\mathit{truth\_value}() = 1,\ U_{j,value_{win}}.\mathit{truth\_value}() = 1\}$
variable updates: $\{t_{end} = t_{system},\ \mathit{value}_{win} = \mathit{newAskPrice}\}$
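
The guards of the bidding loop can be collected in a small Python sketch. It is an illustration only: the AuctionEnv record and the function names are assumptions, and the closing guard is taken to be the complement of the proposal guard (the end time is reached or the inactivity limit has elapsed), which is our reading of the prose description.

```python
from dataclasses import dataclass


@dataclass
class AuctionEnv:
    t_max: float           # maximum duration of the auction
    t_inactivity: float    # maximum inactivity between two bids
    state: str = "closed"
    ask_price: float = 0.0
    t_bid: float = 0.0     # time of the last accepted bid


def can_propose(env: AuctionEnv, t_system: float) -> bool:
    """Guard of propose: the auction has neither expired nor gone idle."""
    return t_system < env.t_max and t_system - env.t_bid < env.t_inactivity


def auctioneer_reply(env: AuctionEnv, v: float) -> str:
    """A bid above the ask price is conditionally accepted, otherwise rejected."""
    return "condAccept" if v > env.ask_price else "reject"


def can_close(env: AuctionEnv, t_system: float) -> bool:
    """Assumed closing guard: auction open and no further bid is admissible."""
    expired = t_system >= env.t_max
    idle = t_system - env.t_bid >= env.t_inactivity
    return env.state == "open" and (expired or idle)
```

With these definitions, whenever the auction is open exactly one of can_propose and can_close holds at any time, so the loop can always either continue or terminate.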



4.2 Interaction Diagram 

The interaction diagram that specifies the English Auction Protocol is reported
in Figure 2. It is possible to verify that an actual interaction complies with the
protocol by checking that the sequence of communicative acts leads from the
unique start state to one of the final states. Moreover, it is possible to prove
the soundness of this protocol specification. In particular, the contents of each
state accessible through different paths are equivalent. For states $s_9$, $s_{10}$, $s_{11}$,
which are in the loop of the protocol, it is necessary to identify a loop invariant
describing $p_j$ as the client who made the highest offer.
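
The compliance check mentioned above amounts to replaying the observed sequence of acts on the interaction diagram. The Python sketch below shows the idea on a deliberately small toy diagram; it does not reproduce Figure 2, and the state and act names in TOY are hypothetical. Guards and the content of states are ignored, and the diagram is assumed to be deterministic.

```python
def complies(transitions, initial, finals, acts):
    """Return True if `acts` leads from `initial` to a final state.

    `transitions` maps (state, act) to the next state.
    """
    state = initial
    for act in acts:
        key = (state, act)
        if key not in transitions:
            return False        # the act is not allowed in this state
        state = transitions[key]
    return state in finals


# Toy diagram with a bidding loop between s1 and s2 (not the diagram of Fig. 2).
TOY = {
    ("s0", "open"): "s1",
    ("s1", "cfp"): "s2",
    ("s2", "propose"): "s1",
    ("s2", "close"): "s3",
}
```

For example, the sequence open, cfp, propose, cfp, close is accepted with final state s3, while open followed directly by close is rejected.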

5 Discussion 

Both in this paper and in [6] our main research focus has been on the definition
of an operational semantics for an agent communication language based on
the notion of social commitment. In particular, in this paper we tried to show
the potential of this approach to define application-independent and verifiable
interaction protocols for multiagent systems and to verify whether a given
protocol is sound with respect to general soundness criteria.

Given that the definition of an agent communication language is strictly
related to the definition of a content language, in our formalization we introduced
the notion of temporal propositions to express the content of communicative
acts. We do not think that the proposed structure for temporal propositions
is expressive enough to cover all the interesting situations where agents need
to interact, but it has proved expressive enough for the formalization of some
crucial examples like the protocol of the proposals and the English auction. The
complete investigation of how to operationally define a content language with
some specific characteristics is beyond the scope of these papers.

Whatever the structure of the content language may be, a crucial requirement 
of our approach is that the events that have to happen in order to make the 
content and the condition of commitment objects (formalized through temporal 
proposition objects) true or false have to be observable. In our model there are 
mainly two types of events that are relevant for interaction systems: 

— the events that happen in the real world external to the interaction system 
and which are signaled to the system through suitable “sensors”, for example 
a change in the balance of a bank account; 

— the events that happen directly in the interaction system, for example the 
exchange of a message. 

In a real application there may be observable and unobservable events, for 
example because some events could be too expensive to be observed, or because 
it is impossible to ascertain who is the actor of the action that caused the event, 
or because its observation would violate the law in force about privacy, etc. 
For the proposed model to be actually adopted, each application
has to define which types of events can be observed
and can consequently change the truth value of temporal propositions in the
system. This is not a limitation of the model but reflects objective limitations
of any computational system.

Another critical aspect of this approach could be the problem of how ar- 
tificial agents can become aware that a certain event has actually happened, 
that the information received is truthful, and consequently that the correspond- 
ing temporal proposition becomes true or false and the related commitment is 
therefore fulfilled or violated. We think that this is mainly a problem of trust in
the source of information. In fact, in our everyday life we may have non-certified
interactions, like when we speak with our friends, and certified interactions, for
example when we buy a house. In artificial interaction systems
we can recognize a similar distinction: situations where agents interact without
a certified mediator and situations where a certified mediator is required.
In non-certified interactions agents can get information about events from trusted
informers, like for example a bank or an electronic sensor. In case of violation
of a commitment an agent can, for example, decide to reduce its level of trust in
the defaulting agent. In certified interactions an “above parties” entity, trusted
by all the participants and able to listen to all the exchanged messages, provides
the information about the state of affairs and about the performed actions and,
in case of violation of commitments undertaken by one of the interacting agents,
takes certain legal proceedings. Regarding this point, and in contrast to what
is asserted in [8] by Jones and Parent, it is important to remark that in the
commitment approach to the definition of the semantics of ACLs being
committed to some state of affairs described by proposition p is not a commitment
to defend p if challenged by other agents, but a commitment to the
truth of p. Such a commitment could become violated if its content turns out to
be false. A more complete discussion about the commitment approach and the
conventional signals approach can be found in [17].

Turning to other approaches, in [11] McBurney
and Parsons propose a computational model of e-commerce called Posit Spaces,
based on the notion of joint commitments. In their ontology of commitment the
available locutions are propose, accept and delete, which are equivalent to our
creation of an unset commitment, acceptance of the commitment, and its
cancellation; in our proposal we also define the rejection of a commitment, which
is important to prevent a proposed commitment from remaining in the system
forever. In Posit Spaces the dynamic evolution of the state of commitments and
the basic authorizations on their manipulation are not explicit; what is said is
that they follow the rules of the electronic institution. In our approach (see
Section 2) we give a basic set of rules for the evolution of the commitment from
unset to fulfilled, violated, or cancelled, and basic authorizations on the
manipulation of commitments, which we think are the basic rules that every interaction
system has to follow even if specific institutions can change them or add new
ones. Moreover, the absence of any description of the structure of posits (one
or more commitments) and of their meaning makes, from our point of view, the
proposed operational model incomplete. An interesting aspect of Posit Spaces
is the use of tuple spaces, a model of communication between computational
entities, and the introduction of an operational model to treat multi-party
commitments.

In [13] Rovatsos et al. propose a model of communication for open systems
based on the notion of expected utility and on the idea that the meaning of
communicative acts lies in their experienced consequences. This approach has
the limitation of making some assumptions about the type of interacting agents,
in particular that they have a utility to maximize. Furthermore, from our point
of view the empirical semantics proposed can be an interesting model of how
the human system of social rules and the meaning of speech acts have evolved
over the centuries, but it is hard to apply to interactions among artificial agents
in open systems, where the agents are heterogeneous and developed by different
designers and where the interaction with the same set of agents is usually very short.

Mallya et al. in [9] propose a content language, similar to the one proposed in 
our approach, where temporal aspects of the content of commitments are taken 
into account. In particular they enhance the expressive power of the language by 
introducing the possibility to nest multiple levels of time intervals and propose 
some results for the detection of commitment violation before the occurrence of
the events to which the commitment refers.

In [1] Bentahar et al. present a formal framework to represent conversations
between agents based on the notion of social commitment and argumentation.
Regarding their treatment of the notion of commitment, they distinguish,
like us, between proposed commitment, accepted commitment and conditional
commitment. The two approaches differ in the possible states of the commitment
and in the set of allowed operations on commitments. However, their approach,
like the approach presented in [9], provides agents with an action to fulfill
and violate commitments, and is therefore deeply different from our proposal, where
commitments become fulfilled or violated on the basis of the truth value of the
temporal proposition and not because of an action actively performed by an agent.

In [17] Verdicchio and Colombetti give a logical model of the notion of commitment
which is close to, even if not completely compatible with, the one presented
here. We plan to make the two models compatible in the near future.

6 Conclusions 

In this paper we presented an application-independent method for the definition
of interaction protocols, based on the meaning of the exchanged messages, that
can be used to define patterns of interaction in open, dynamic, and heterogeneous
agent systems. The method proposed is based on a general ACL, whose semantics
is defined in terms of commitments, and on a further component defining
protocol-specific interaction rules. The resulting interaction protocols are
verifiable, in the sense that it is possible to test whether an agent is behaving in
accordance with them. Moreover, soundness conditions are proposed to verify whether
the structure of a given interaction protocol is reasonable. We also show how our
method can be used to define a complex and common interaction protocol, the
English Auction.

With respect to our previous operational proposal [6] of a commitment-based 
ACL semantics, in this paper we introduce a treatment of conditional acceptance 
and of declarations. 

Our method for the definition of interaction protocols differs from most exist- 
ing proposals, in that it is based on the use of an application-independent library 
of communicative acts, whose meaning is fully preserved when they occur in a 
protocol. With respect to the proposal put forward by Yolum and Singh in [18], 
our approach is focussed on the protocol design phase more than on the possibil- 
ity of shortcutting predefined interaction patterns at run time. Indeed, we expect 
that agents used in practical applications will mostly be simple reactive agents;
if this idea is correct, proving the soundness of a protocol at design time is more 
important than allowing agents to plan intelligent variations of existing proto- 
cols. In principle, however, a deliberative agent with reasoning capabilities could 
understand our protocols on the basis of an ontology of commitment, linguistic 
knowledge (i.e., knowledge of a Communicative Act Library with semantics), 
and the ability to reason on interaction diagrams (i.e., a version of finite state
machines).
Abstract. As part of the goal of developing a genuinely open multiagent system,
many efforts are devoted to the definition of a standard Agent Communication
Language (ACL). The aim of this paper is to propose a logical framework for the
definition of ACL semantics based upon the concept of (social) commitment. We
assume that agent communication should be analyzed in terms of communicative
acts, by means of which agents create and manipulate commitments, provided
certain contextual conditions hold. We propose formal definitions of such actions
in the context of a temporal logic that extends CTL* with past-directed temporal
operators. In the system we propose, called CTL±, time is assumed to be discrete,
with no start or end point, and branching in the future. CTL± is then extended to
represent actions and commitments; in particular, we formally define the
conditions under which a commitment is fulfilled or violated. Finally, we show
how our logic of commitment can be used to define the semantics of an ACL.



1 Introduction 

One of the main goals in the field of autonomous agents is the development of genu- 
inely open multiagent systems. As part of this enterprise, many efforts are devoted to 
the definition of a standard Agent Communication Language (ACL). So far, two 
ACLs have been widely discussed in the literature: KQML [4] and FIPA ACL [5], but 
we do not yet have a universally accepted standard. In particular, there is no general 
agreement on the definition of ACL semantics. 

The aim of this paper is to propose a framework for the definition of ACL seman- 
tics based upon the concept of (social) commitment, thus adopting an approach that 
has already been proposed and discussed by some authors [12,2]. In our view, a
commitment-based approach to semantics has remarkable advantages over the more 
traditional proposals based on mental states (see for example [1,8]). In particular, 
commitments, contrary to mental states, are public and observable, thus they do not
need to be attributed to other agents by means of inference processes, and can be
stored in public records for further reference. 

Our framework, like all major proposals in the field of ACLs, relies on the assump- 
tion that agent communication should be analyzed in terms of communicative acts. In 
our view, communicative acts are performed by agents to create and manipulate 
commitments. That is, agents modify the social state of a multiagent system by carry- 
ing out speech acts that affect the network of commitments binding agents to one 
another. For instance, when agent a informs agent b that p, then a becomes commit- 
ted, relative to b, to the fact that p holds. As we shall show in the rest of this paper, we 
can similarly model other kinds of communicative acts from the perspective of com- 
mitments. 

Previous versions of our model have been published elsewhere [2,6]. However, we 
try here for the first time to delineate a full logical model of commitment, including 
the aspects related to time. In Section 2 we illustrate some aspects of our model of 
time and action. In Section 3 we present a formal model of commitment. In Section 4 
we investigate the relations between message exchanges, communicative acts and
the creation and manipulation of commitments. Finally, in Section 5, we draw our 
conclusions and illustrate some future work. 



2 Time and Action 

2.1 Time 

For the treatment of time, we adopt a framework close to the CTL* temporal logic [3].
As is well known, CTL* is a powerful logic of branching time, developed to prove
properties of computational processes. In the context of agent interaction, we found it
necessary to extend CTL* with past-directed temporal operators. In the system we
propose, called CTL± and essentially equivalent to PCTL [9], time is assumed to be
discrete, with no start or end point, and branching in the future.

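The formal language L of CTL± is the smallest set such that:

$A \subseteq L$, where $A$ is a suitable set of atomic formulae;

$\neg L \subseteq L$, $(L \wedge L) \subseteq L$;

$X^+ L \subseteq L$, $X^- L \subseteq L$, $(L\, U^+ L) \subseteq L$, $(L\, U^- L) \subseteq L$;

$A L \subseteq L$, $E L \subseteq L$.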

The intuitive meaning of the temporal operators is the same as in CTL*, with the 
additional stipulation that: 

$X^+$ means at the next instant (in the future);

$X^-$ means at the previous instant (in the past);

$U^+$ means until (in the future);

$U^-$ means since (in the past).

A and E are path quantifiers, respectively meaning for all paths and for some path. 

To define the formal semantics of L, let S be a set of states. A CTL± frame F on S
is an infinite tree-like structure on S, where every state has exactly one predecessor
and a nonempty set of successors.

A path in frame F is an infinite sequence $p = (p_0, \ldots, p_n, \ldots)$ of states, such that for
every element $p_n$ in the sequence, element $p_{n+1}$ is one of the successors of $p_n$ in F. The
subsequence of p starting from element $p_n$ is itself a path, and will be denoted by $p^n$.
The set of all paths starting from state s will be denoted by Paths(s). Paths allow us to
formalize the concepts of being “in the past” or “in the future” of some state. More
precisely, we say that state s’ is in the future of s (in frame F) iff there is a path p such
that $s = p_0$ and, for some n, $s' = p_n$. Symmetrically, we say that s’ is in the past of s (in
frame F) iff there is a path p such that $s' = p_0$ and, for some n, $s = p_n$.
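
The "in the past" and "in the future" relations can be illustrated on a finite fragment of a frame. The sketch below is only an illustration under stated assumptions: the class `State` and the functions `in_past` and `in_future` are hypothetical names, and a finite tree stands in for the infinite frame of the text. Because every state has exactly one predecessor, checking the past of a state amounts to walking up its chain of predecessors.

```python
class State:
    """A state in a finite fragment of a tree-like frame (hypothetical helper)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent    # the unique predecessor (None only at the fragment's edge)
        self.children = []      # the successors known in this fragment
        if parent is not None:
            parent.children.append(self)


def in_past(s_prime, s):
    """True iff s_prime is in the past of s, i.e. some path from s_prime reaches s."""
    node = s
    while node is not None:
        if node is s_prime:
            return True
        node = node.parent
    return False


def in_future(s_prime, s):
    """True iff s_prime is in the future of s (the symmetric relation)."""
    return in_past(s, s_prime)


# A small branching fragment: s has two successors, and `later` follows `left`.
s = State("s")
left = State("left", parent=s)
right = State("right", parent=s)
later = State("later", parent=left)
```

Since the definition allows n = 0, both relations are reflexive in this sketch; two states on different branches, such as `left` and `right`, are in neither relation.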

A CTL± model is a pair M = (F,v), where F is a CTL± frame and v is an evaluation
function assigning a Boolean truth value to every atomic formula at every state. We 
are now ready to define the truth conditions of an L formula in model M on path p: 



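$M,p \models \varphi$, where $\varphi$ is an atomic formula, iff $v(\varphi, p_0) = 1$;

$M,p \models \neg\varphi$ iff $M,p \not\models \varphi$;

$M,p \models (\varphi \wedge \psi)$ iff $M,p \models \varphi$ and $M,p \models \psi$;

$M,p \models X^+\varphi$ iff $M,p^1 \models \varphi$;

$M,p \models X^-\varphi$ iff for some path q, ($q^1 = p$ and $M,q \models \varphi$);

$M,p \models (\varphi\, U^+ \psi)$ iff for some n, ($M,p^n \models \psi$ and for all m s.t. $0 \le m < n$, $M,p^m \models \varphi$);

$M,p \models (\varphi\, U^- \psi)$ iff for some path q and for some n, ($q^n = p$ and $M,q \models \psi$ and for all m s.t. $0 < m \le n$, $M,q^m \models \varphi$);

$M,p \models A\varphi$ iff for all $q \in \mathit{Paths}(p_0)$, $M,q \models \varphi$;

$M,p \models E\varphi$ iff for some $q \in \mathit{Paths}(p_0)$, $M,q \models \varphi$.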



We define an L formula to be true in model M at state s iff it is true in M on all 
paths starting from s: 

$M,s \models \varphi$ iff for all $p \in \mathit{Paths}(s)$, $M,p \models \varphi$.

Finally, we define a formula to be valid iff it is true on all paths of every model:

$\models \varphi$ iff for all M and all p, $M,p \models \varphi$.

Taking the temporal operators $X^+$, $X^-$, $U^+$, and $U^-$ as primitives, we can introduce
the following operators as abbreviations: $F^+$ (sometimes in the future), $F^-$ (sometimes
in the past), $G^+$ (always in the future), $G^-$ (always in the past):

$F^+\varphi =_{def} \mathit{true}\; U^+ \varphi$,

$F^-\varphi =_{def} \mathit{true}\; U^- \varphi$,

$G^+\varphi =_{def} \neg F^+ \neg\varphi$,

$G^-\varphi =_{def} \neg F^- \neg\varphi$.

We also define a “weak until” and a “weak since” temporal operator:

$\varphi\, W^+ \psi =_{def} G^+\varphi \vee \varphi\, U^+ \psi$,

$\varphi\, W^- \psi =_{def} G^-\varphi \vee \varphi\, U^- \psi$.

Later on we shall use another derived operator, representing the intuitive concept
of “until-and-no-longer”. This operator is defined as follows:

$\varphi\, Z^+ \psi =_{def} (\varphi\, W^+ \psi) \wedge G^+(\psi \rightarrow G^+ \neg\varphi)$.

In other words, $\varphi\, Z^+ \psi$ is true iff in the future: $\psi$ never becomes true and $\varphi$ is true
forever, or $\psi$ eventually becomes true and from then on $\varphi$ is no longer true. More derived
temporal operators will be defined later on to treat specific examples.
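
To get a feeling for the future-directed derived operators, one can evaluate them on a single finite trace. This is a simplification we introduce only for illustration: the logic of the text is defined on infinite branching paths, whereas the sketch below treats a finite list of truth values as the whole future, and all function names are hypothetical.

```python
from typing import Sequence


def always(phi: Sequence[bool], i: int = 0) -> bool:
    """G+ phi at position i of a finite trace (the trace end is taken as the end of time)."""
    return all(phi[i:])


def until(phi: Sequence[bool], psi: Sequence[bool], i: int = 0) -> bool:
    """phi U+ psi at position i: psi holds at some n >= i and phi holds at every m with i <= m < n."""
    for n in range(i, len(psi)):
        if psi[n]:
            return True
        if not phi[n]:
            return False
    return False


def weak_until(phi: Sequence[bool], psi: Sequence[bool], i: int = 0) -> bool:
    """phi W+ psi: either phi holds forever, or phi holds until psi."""
    return always(phi, i) or until(phi, psi, i)


def until_and_no_longer(phi: Sequence[bool], psi: Sequence[bool], i: int = 0) -> bool:
    """phi Z+ psi: phi W+ psi, and whenever psi holds, phi is false from then on."""
    no_return = all(not psi[n] or not any(phi[n:]) for n in range(i, len(psi)))
    return weak_until(phi, psi, i) and no_return


# phi: "the commitment holds"; psi: "the commitment is cancelled".
held = [True, True, False, False]
cancelled = [False, False, True, False]
```

With these traces `until_and_no_longer(held, cancelled)` is true, because the commitment holds up to the cancellation and never reappears; it becomes false if the last element of `held` is changed to `True`.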



2.2 Events and Actions 

We now extend the temporal language L of CTL± in order to represent events and
actions. We do this by introducing a number of predicates on sorted arguments. 

We reify events, that is, we treat them as a sort of individuals, called event tokens. 
Every event token belongs to at least one event type, and takes place (happens) at
exactly one time instant. We focus on a special kind of events, actions, which are 
brought about by an agent, called the actor of the action. 

In the following, variables e, e’, ..., will range over event tokens; variables x, y, ...,
will range over agents; and variables t, t’, ..., will range over event types. We take
Happ(e), Type(e,t) and Actor(e,x) as primitive predicates, and define:

$\mathit{Done}(e,x,t) =_{def} \mathit{Happ}(e) \wedge \mathit{Type}(e,t) \wedge \mathit{Actor}(e,x)$.

The formula Done(e,x,t) expresses the fact that event e of type t is brought about
by agent x. For the sake of convenience, at times we shall use the “m-dash” character
to express existential quantification, as in the example below:

$\mathit{Done}(e,\text{---},t) =_{def} \exists x\, \mathit{Done}(e,x,t)$.
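
The definition of Done can be read operationally as a check on an event record. The following sketch is an assumption-laden illustration, not part of the formal model: `EventToken`, `happ`, `has_type`, `has_actor` and `done` are hypothetical names, time is represented by integer instants on a single path, and passing no actor plays the role of the m-dash.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class EventToken:
    name: str
    types: FrozenSet[str]          # every token belongs to at least one event type
    instant: int                   # the single instant at which the token happens
    actor: Optional[str] = None    # set only for actions


def happ(e: EventToken, now: int) -> bool:
    """Happ(e), evaluated at instant `now`."""
    return e.instant == now


def has_type(e: EventToken, t: str) -> bool:
    """Type(e,t)."""
    return t in e.types


def has_actor(e: EventToken, x: str) -> bool:
    """Actor(e,x)."""
    return e.actor == x


def done(e: EventToken, t: str, now: int, x: Optional[str] = None) -> bool:
    """Done(e,x,t) at `now`; x=None stands for the m-dash, i.e. 'some actor'."""
    if not (happ(e, now) and has_type(e, t)):
        return False
    if x is None:
        return e.actor is not None
    return has_actor(e, x)


e1 = EventToken("e1", frozenset({"inform"}), instant=4, actor="a")
```

Here `done(e1, "inform", 4, "a")` and `done(e1, "inform", 4)` both hold, while the same queries at instant 5 fail because the token happens at exactly one instant.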

The semantics of L has now to be enriched to account for the interpretation of the
extended language. This can be done by: adding a typed domain D of individuals to
every model M; defining an interpretation of first-order terms into D; defining an
interpretation of primitive predicates in D; and defining the semantics of the
first-order quantifiers ∀ and ∃. In this paper we do not develop these technical aspects in
detail, and thus rely on the reader’s intuition for the interpretation of first-order
expressions.

As usual, we now need to introduce a number of axioms to constrain the interpreta- 
tion of primitive predicates. It should be noted that such axioms do not alter the struc- 
ture of temporal frames, but reduce the set of allowable models by putting constraints 
on the interpretation of terms and predicates. Validity of formulae must then be
understood with respect to the class of CTL± models that satisfy such constraints.

As we already said, the instant at which an event takes place on a path is unique. 
We therefore adopt the axiom 

$\mathit{Happ}(e) \rightarrow X^- G^- \neg\mathit{Happ}(e) \wedge A X^+ G^+ \neg\mathit{Happ}(e)$. (UH)



3 Commitment 

3.1 Representing Commitments 

We define a commitment as a social state between agents including three
components:

• the debtor, that is, the agent that is committed;

• the creditor, that is, the agent relative to which the debtor is committed;

• the content, that is, the state of affairs to which the debtor is committed relative to
the creditor.

A commitment is said to be a precommitment when it has been proposed, but not
yet accepted or refused. In such a case, we say that the (potential) debtor is
precommitted to the (potential) creditor. In our treatment, both precommitments and actual
commitments arise from the performance of communicative acts.

We view (social) commitment as a deontic state, akin to obligation. For this
reason, it is essential to define when a commitment is fulfilled and when it is violated.
We shall give the relevant formal definitions in the following subsections. However, 
in this paper we do not investigate what is going to happen when a commitment is 
fulfilled or violated (e.g., in terms of agent reputation, sanctions, etc.). These are im- 
portant aspects of multiagent systems management, but go beyond the conceptual 
definition of commitment. 

We now extend our formal language to accommodate the treatment of
commitments. The resulting language will be called Semantic Language, given that its
purpose is to define the semantics of ACL messages. To represent a commitment, we
need to represent a debtor, a creditor, and a content. Debtors and creditors are agents,
and shall be represented by first-order terms of sort agent, as we already did in
Subsection 2.2. The representation of content is more critical. It seems to us that there are
basically two possibilities:

• The content can be represented by a formula of the Semantic Language. In this 
case, commitment can be represented through a modal operator, analogously to the 
deontic logic representation of obligation. 

• The content can be represented as a first-order term. In this case, a commitment 
can be represented by a first-order formula. 

We believe there are at least two reasons to adopt the latter solution. The first, ob- 
vious reason is that the technicalities required by a predicative representation are 
simpler than the ones required by a modal representation. The second, more impor- 
tant, reason is that in the context of agent communication the content of a commit- 
ment, as we shall see later on, is always derived from an agent message. More pre- 
cisely, a commitment’s content derives from a statement in some Content Language 
(CL): think for example of the value of the :content parameter in KQML or FIPA
ACL messages. With respect to a CL, the Semantic Language we are presently defin- 
ing can be viewed as a meta-language. It is therefore feasible to represent a CL state- 
ment by a first-order term of the Semantic Language. Such a first-order term may be 
viewed as the representation of the abstract syntax of a concrete CL statement. Of 
course, in the Semantic Language it is not sufficient to represent the syntax of a CL 
statement: we also need to represent its semantics. To do so, we shall assume that: 

• The abstract syntax of any CL statement can be represented by a first-order term of 
the Semantic Language. 

• If u is such a term, then the meaning of the corresponding statement is represented
by a formula of the Semantic Language, which we shall denote by $\lfloor u \rfloor$. In other
words, $\lfloor u \rfloor$ is a truth-preserving translation of u into a formula of the Semantic
Language. For such a translation to be possible the Semantic Language will have to
include enough predicate, function, and constant symbols to represent the meaning
of CL statements.

We introduce two predicates, Comm and Prec, to represent commitments and
precommitments. In particular,

Comm(e,x,y,u)

will mean that event e has brought about a commitment for agent x, relative to agent
y, to the truth of $\lfloor u \rfloor$. When the above formula is true, we shall say that e is a
commitment-inducing event. Precommitments are represented analogously:

Prec(e,x,y,u)

will mean that event e has brought about a precommitment for agent x, relative to
agent y, to the truth of $\lfloor u \rfloor$.

Under given conditions, that we shall analyze later on, commitments can be made
or cancelled, and precommitments can be made, cancelled or accepted (i.e., turned
into actual commitments). This is possible thanks to the performance of tokens of
suitable action types, formally defined in the next subsection: make commitment
(mc), make precommitment (mp), cancel commitment (cc), cancel precommitment
(cp), and accept precommitment (ap). Such actions, as we shall see later on, are
performed by exchanging messages in an ACL.



3.2 A Logical Model of Commitment 

The action types for commitment manipulation are defined by axioms describing their 
constitutive effects, that is, by describing the states of affairs that necessarily hold if a
token of a given action type is successfully performed. 

Make Commitment 

$\mathit{Done}(e,\text{---},mc(x,y,u)) \rightarrow A\,(\mathit{Comm}(e,x,y,u)\; Z^+\, \mathit{Done}(\text{---},\text{---},cc(e,x,y,u)))$. (MC)

Axiom MC says that: 

if an agent (not necessarily x or y) successfully performs an action of making a 
commitment with x as the debtor, y as the creditor, and u as the content, 
then on all paths x is committed, relative to y, to content u, 

until an agent possibly cancels such a commitment, after which the commitment no 
longer exists. 

It is important to remark that Axiom MC only defines what making a commitment 
means. It does not establish in what way, and under what conditions, an agent may 
actually make or cancel a commitment in a concrete situation. This aspect will be 
dealt with in Section 4. 

Make Precommitment 

$\mathit{Done}(e,\text{---},mp(x,y,u)) \rightarrow A\,(\mathit{Prec}(e,x,y,u)\; Z^+\, (\mathit{Done}(\text{---},\text{---},ap(e,x,y,u)) \vee \mathit{Done}(\text{---},\text{---},cp(e,x,y,u))))$. (MP)

Axiom MP is analogous to MC. 

Accept Precommitment 

$\mathit{Done}(e',\text{---},ap(e,x,y,u)) \wedge \neg\mathit{Done}(\text{---},\text{---},cp(e,x,y,u)) \rightarrow A\,(\mathit{Comm}(e',x,y,u)\; Z^+\, \mathit{Done}(\text{---},\text{---},cc(e',x,y,u)))$. (AP)

Axiom AP says that:

if an agent successfully performs an action of accepting a precommitment brought 
about by event e, with debtor x, creditor y, and content u, 
and no agent has just cancelled such a precommitment, 

then the action of acceptance brings about on all paths a commitment for x, relative 
to y, to content u, which will stand until it is possibly cancelled. 

Again, this axiom does not say by what means or under what conditions an agent 
may actually accept a precommitment in a concrete situation. 

The next axiom ensures that an event, which takes place at a certain instant, can
(pre)commit agents only from that moment on. No (pre)commitment is retroactive:

$\mathit{Happ}(e) \rightarrow X^- G^- (\neg\mathit{Prec}(e,x,y,u) \wedge \neg\mathit{Comm}(e,x,y,u))$.

Finally, the next axiom states that all (pre)commitments are necessarily brought
about by some event:

$\mathit{Prec}(e,x,y,u) \vee \mathit{Comm}(e,x,y,u) \rightarrow F^- \mathit{Happ}(e)$.
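
The joint effect of the axioms of this subsection can be sketched as a small ledger that records, along one path, the interval during which each (pre)commitment holds. The sketch rests on several assumptions of ours: instants are integers, the names `Record` and `Ledger` are hypothetical, a (pre)commitment is taken to hold from the instant of its inducing event (inclusive) to the instant of the action that ends it (exclusive), and acceptance is only applied to a precommitment that still holds.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Record:
    kind: str                  # "prec" or "comm"
    debtor: str
    creditor: str
    content: str
    start: int                 # instant of the inducing event
    end: Optional[int] = None  # instant of the cancelling or accepting action

    def holds(self, t: int) -> bool:
        # Not retroactive: nothing holds before the inducing event happens.
        return self.start <= t and (self.end is None or t < self.end)


class Ledger:
    """Records (pre)commitments along a single path, keyed by inducing event."""

    def __init__(self) -> None:
        self.records: Dict[str, Record] = {}

    def make_commitment(self, e: str, t: int, x: str, y: str, u: str) -> None:      # mc
        self.records[e] = Record("comm", x, y, u, t)

    def make_precommitment(self, e: str, t: int, x: str, y: str, u: str) -> None:   # mp
        self.records[e] = Record("prec", x, y, u, t)

    def cancel(self, e: str, t: int) -> None:                                       # cc or cp
        rec = self.records[e]
        if rec.holds(t):
            rec.end = t

    def accept_precommitment(self, e_new: str, e: str, t: int) -> None:             # ap
        rec = self.records[e]
        if rec.kind == "prec" and rec.holds(t):
            rec.end = t   # the precommitment ceases to exist ...
            # ... and the accepting event e_new induces an actual commitment.
            self.records[e_new] = Record("comm", rec.debtor, rec.creditor, rec.content, t)

    def holds(self, kind: str, e: str, t: int) -> bool:
        rec = self.records.get(e)
        return rec is not None and rec.kind == kind and rec.holds(t)


ledger = Ledger()
ledger.make_precommitment("e1", 1, "a", "b", "u")
ledger.accept_precommitment("e2", "e1", 3)
ledger.cancel("e2", 6)
```

In this run the precommitment induced by e1 holds at instants 1 and 2, the commitment induced by e2 holds from instant 3 to instant 5, and nothing holds from instant 6 on.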



3.3 Fulfillment and Violation 

Intuitively, a commitment is fulfilled when its content is true, and is violated when its 
content is false. However, given that we are working in the context of branching-time 
logic, the formal definitions are not trivial. 

Let us start with an informal example. Suppose that thanks to event $e_1$, agent a is
committed, relative to agent b, to the content expressed by CL sentence $u_1$, whose
intuitive meaning is “it will rain until midnight.” Suppose further that $e_1$ takes place at
4:00 pm, and that it persistently rains from that time to 6:00 pm, inclusive. Intuitively,
at 6:00 pm the commitment induced by $e_1$ is neither fulfilled nor violated (we shall say
that the commitment is pending). Now consider two possible developments:

• It goes on raining until midnight. In this case, the commitment induced by $e_1$ is
fulfilled at time 0:00 am.

• At 6:01 pm it suddenly stops raining. In this case, the commitment induced by $e_1$ is
violated at 6:01 pm.

In order to formalize these intuitions, two problems must be solved. The first prob- 
lem has to do with the temporal indexicality of content sentences. By this we mean 
that the truth of the sentence “it will rain until midnight” has to be evaluated with 
respect to the state at which the commitment is made (the point of speech, in
Reichenbach’s terminology¹ [11]). On the other hand, to know whether the commitment
is fulfilled or violated we have to wait until something relevant happens, that is, until
the first state at which it stops raining, or the first state at which it is midnight
(Reichenbach’s point of event). But then, and this is the second problem, what is the truth
value of the content at a generic state (Reichenbach’s point of reference) lying
between the point of speech and the point of event?



1 The German philosopher Hans Reichenbach proposed a famous model of verb tenses in 
Chapter 7 of his book Elements of Symbolic Logic. We adopt his terminology, but reinterpret 
it with some freedom. 
We propose a solution in which:

• content sentences are temporally de-indexicalized in a simple and uniform way, by 
conjoining their translation into the Semantic Language with an atomic formula 
setting the point of speech; 

• the truth value of a content sentence at a given point of reference is evaluated with 
respect to all paths starting from the point of reference. 

Fulfillment 

On the basis of our previous considerations, fulfillment can be formally defined as 
follows: 

$\mathit{Fulf}(e,x,y,u) =_{def} \mathit{Comm}(e,x,y,u) \wedge A F^- (\mathit{Happ}(e) \wedge \lfloor u \rfloor)$. (FC)

To understand this definition correctly, it is helpful to go back to our previous ex- 
ample. Let us assume that 

$\lfloor u_1 \rfloor = (\mathit{rain}\; U^+ \mathit{midnight})$,

and suppose that the commitment-inducing event $e_1$ takes place in model M at state s:

$M,s \models \mathit{Happ}(e_1)$,

$M,s \models A\,(\mathit{Comm}(e_1,a,b,u_1)\; Z^+\, \mathit{Done}(\text{---},\text{---},cc(e_1,a,b,u_1)))$.

Now consider an arbitrary state s’ in the future of s. We have

$M,s' \models \mathit{Fulf}(e_1,a,b,u_1)$ iff $M,s' \models \mathit{Comm}(e_1,a,b,u_1) \wedge A F^- (\mathit{Happ}(e_1) \wedge \lfloor u_1 \rfloor)$.

Let us assume that the commitment made at s has not been cancelled until s’
(inclusive). This implies that

$M,s' \models \mathit{Comm}(e_1,a,b,u_1)$.

Under such conditions, the commitment is fulfilled at s’ iff

$M,s' \models A F^- (\mathit{Happ}(e_1) \wedge \lfloor u_1 \rfloor)$,

that is, iff for all $p \in \mathit{Paths}(s')$,

$M,p \models F^- (\mathit{Happ}(e_1) \wedge \lfloor u_1 \rfloor)$.

Therefore, for the commitment to be fulfilled at s’, the formula

$\mathit{Happ}(e_1) \wedge (\mathit{rain}\; U^+ \mathit{midnight})$

must be true at some state in the past of s’. Now, thanks to Axiom UH (Section 2.2)
we know that on every path the state at which an event takes place is unique. Thus,
for the commitment to be fulfilled at s’, the formula

$(\mathit{rain}\; U^+ \mathit{midnight})$

must be true on all paths $p \in \mathit{Paths}(s)$ going through s’. A model satisfying these
requirements is depicted in Figure 1.

This example shows how statement $u_1$ is de-indexicalized by evaluating it in the
state s at which event $e_1$ took place.

Moreover, the definition of fulfillment at s’ considers the truth value of $\lfloor u_1 \rfloor$ on all
paths starting from s and going through s’. The set of such paths typically becomes
smaller when s’ is moved further in the future of s. As a consequence, a commitment
that is not yet fulfilled at s may be fulfilled at some state s’ in the future of s.

Fig. 1. Formula $\mathit{rain}\; U^+ \mathit{midnight}$ is true on all paths starting from s and going through s’.

Violation

Analogously to fulfillment, we can define violation as follows: 

$\mathit{Viol}(e,x,y,u) =_{def} \mathit{Comm}(e,x,y,u) \wedge A F^- (\mathit{Happ}(e) \wedge \neg\lfloor u \rfloor)$. (VC)

Pending commitments 

A commitment is pending iff it is neither fulfilled nor violated: 

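$\mathit{Pend}(e,x,y,u) =_{def} \mathit{Comm}(e,x,y,u) \wedge \neg\mathit{Fulf}(e,x,y,u) \wedge \neg\mathit{Viol}(e,x,y,u)$. (PC)

Thanks to Axiom UH (Section 2.2), from the above definition we can derive:

$\models \mathit{Pend}(e,x,y,u) \leftrightarrow \mathit{Comm}(e,x,y,u) \wedge E F^- (\mathit{Happ}(e) \wedge \lfloor u \rfloor) \wedge E F^- (\mathit{Happ}(e) \wedge \neg\lfloor u \rfloor)$.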

However, Definition PC raises a fairly subtle formal problem, which we shall ana- 
lyze in the next subsection. 
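
The rain example of this subsection can be replayed on a small finite tree. The sketch below is only an approximation under explicit assumptions: the frame is cut down to a finite tree, a path on which midnight is never reached before a leaf is counted as falsifying the content, the commitment is assumed not to be cancelled, and the names `WeatherState`, `outcomes` and `status` are hypothetical. The status at a state s’ is obtained by collecting the truth value of rain U+ midnight, evaluated at the point of speech, on every path that goes through s’.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class WeatherState:
    label: str
    rain: bool
    midnight: bool = False
    successors: List["WeatherState"] = field(default_factory=list)


def outcomes(state: WeatherState) -> Set[bool]:
    """Truth values of (rain U+ midnight) over all finite paths starting at `state`."""
    if state.midnight:
        return {True}
    if not state.rain:
        return {False}
    if not state.successors:
        return {False}   # assumption: an undecided path at a leaf counts as false
    result: Set[bool] = set()
    for nxt in state.successors:
        result |= outcomes(nxt)
    return result


def status(prefix: List[WeatherState]) -> str:
    """Status at prefix[-1] of a commitment made at prefix[0] (the point of speech)."""
    for st in prefix[:-1]:
        if st.midnight:
            return "fulfilled"   # the content was already decided true on the way
        if not st.rain:
            return "violated"    # the content was already decided false on the way
    seen = outcomes(prefix[-1])
    if seen == {True}:
        return "fulfilled"
    if seen == {False}:
        return "violated"
    return "pending"


midnight = WeatherState("0:00 am", rain=True, midnight=True)
late_dry = WeatherState("11:00 pm, dry", rain=False)
wet = WeatherState("6:01 pm, rain", rain=True, successors=[midnight, late_dry])
dry = WeatherState("6:01 pm, dry", rain=False)
six = WeatherState("6:00 pm", rain=True, successors=[wet, dry])
four = WeatherState("4:00 pm", rain=True, successors=[six])
```

With this tree the commitment is pending at 6:00 pm, violated at the dry 6:01 pm state, still pending at the rainy 6:01 pm state (rain may yet stop), and fulfilled at midnight. If the `late_dry` branch is removed, the commitment is already fulfilled at the rainy 6:01 pm state, since every remaining path through it satisfies the content.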



3.4 Some Properties of Commitment 

We shall now try to show that the axioms and definitions given in the previous sub- 
sections determine a satisfactory “logic of commitment.” 

Let us start with a few notes on fulfillment and violation. It is easy to see that if a 
commitment is introduced through a make commitment or accept precommitment 
action and later cancelled, it can no longer be fulfilled or violated. This is a direct 
consequence of Axioms MC and AP, and of Definitions FC and VC. Even if a com- 
mitment has already been fulfilled or violated in the past, it is no longer fulfilled or 
violated after it is cancelled. It is possible, however, to express the idea that a
commitment has been fulfilled or violated in the past, by using the $F^-$ operator. It would
also be possible to constrain cancel commitment actions so that commitments that 
have already been fulfilled or violated can no longer be cancelled. 

Some commitments can be fulfilled, but can never be violated in a finite period of
time. An example is a commitment whose content, translated into the Semantic
Language, is $F^+ \mathit{rain}$. Analogously, some commitments can be violated but never fulfilled
in finite time. Consider for example a commitment to $G^+ \mathit{rain}$.

All commitments whose content is logically valid are immediately fulfilled. 
Dually, all commitments whose content is logically contradictory are immediately 
violated. Moreover, all commitments whose point of event is in the past of the point 
of speech are immediately fulfilled or violated. 

From the definitions of Section 3.3, every commitment is either fulfilled, or vio- 
lated, or pending, and these three states are mutually exclusive. In fact it is possible to 
prove that 

$\models \mathit{Comm}(e,x,y,u) \rightarrow \mathit{xor}(\mathit{Fulf}(e,x,y,u), \mathit{Viol}(e,x,y,u), \mathit{Pend}(e,x,y,u))$;

that is, exactly one of Fulf(e,x,y,u), Viol(e,x,y,u), or Pend(e,x,y,u) is true in all models
at every state at which Comm(e,x,y,u) holds.

This result, however, should be interpreted with some care. To show why, let us go 
back again to our example of Section 3.3. Suppose that thanks to event $e_1$, agent a is
committed, relative to agent b, to the fact that it will rain until midnight; that $e_1$ takes
place at 4:00 pm; and that it persistently rains from that time to 6:00 pm, inclusive. As
we have remarked in the previous subsection, at 6:00 pm the commitment induced by $e_1$ is
intuitively pending.

However, without further assumptions it is not possible to prove this. The reason is 
that there are models of the Semantic Language in which the commitment is not pend- 
ing, but fulfilled. Consider for example a one-path frame, and assume that the atomic 
formula rain is true at every state. In such a model, the commitment to the fact that it 
will rain until midnight is fulfilled as soon as it is made. Given that in some models 
the commitment is fulfilled, it is not possible to prove that it is pending. 

The problem has nothing to do with our definitions. Rather, it derives from the fact 
that certain intuitions about the world are not represented in the Semantic Language. 
In this case, the intuition is that rain is contingent, in the sense that it is always logi- 
cally possible that it rains or that it does not rain at the next state. If we want to carry 
this intuition into the Semantic Language, we need to exclude all models that do not 
meet it. This can be done by assuming the following contingency axiom for rain: 

$E X^+ \mathit{rain} \wedge E X^+ \neg\mathit{rain}$.

Of course, this axiom does not belong to a logical model of commitment, but 
represents a fragment of domain knowledge. Such knowledge has to be expressed in 
terms of suitable axioms if we want to derive properties of commitments that square 
with our intuitions about the world. 

So far we said nothing about the behaviour of the commitment predicate with re- 
spect to the structure of content. To do so, however, we must make some assumptions 
about the abstract syntax of the CL. Let us assume that the CL allows for the Boolean 
connectives, quantifiers and temporal operators that, for the sake of simplicity, we 
will represent by the same symbols we use in the Semantic Language. 

Now consider the formula 

$\mathit{Comm}(e,x,y,u \wedge v)$.

The question is: if e commits x, relative to y, to $u \wedge v$, does it also separately
commit x to u and to v? In fact, our logic does not allow us to derive Comm(e,x,y,u) or
Comm(e,x,y,v) from $\mathit{Comm}(e,x,y,u \wedge v)$. It turns out, however, that we do not need to
add anything to our axioms and definitions to obtain a satisfactory behavior of
commitment with respect to conjunction. Indeed, it is easy to see that

$\models \mathit{Comm}(e,x,y,u \wedge v) \wedge A F^- (\mathit{Happ}(e) \wedge \neg\lfloor u \rfloor) \rightarrow \mathit{Viol}(e,x,y,u \wedge v)$,

$\models \mathit{Comm}(e,x,y,u \wedge v) \wedge A F^- (\mathit{Happ}(e) \wedge \neg\lfloor v \rfloor) \rightarrow \mathit{Viol}(e,x,y,u \wedge v)$.

The validity of these formulae allows one to say, in informal speech, that 
if a debtor is committed, relative to a creditor, to the conjunction of u and v, 
then the debtor is committed, relative to the creditor, to both u and v, 
in the sense that the falsity of either u or v implies a violation of the original com- 
mitment. 

Another interesting problem is given by the treatment of conditional commitments, 
that is, commitments that become active provided some condition holds. Conditional 
commitments are not trivial to define in terms of the material conditional, and are 
often given an ad hoc treatment (see for example [6,13]), not dissimilar from the 
treatment of conditional obligation in deontic logic. To see where difficulties come 
from, let us see what happens if a conditional commitment is simply defined as a 
commitment with a conditional content. Suppose for example that event $e_1$ commits
agent a, relative to agent b, to the fact that if lightning is seen, thunder will be
heard immediately after. Further suppose that event $e_1$ takes place in model M at state
s (i.e., s is the point of speech), and let formula

$\mathit{Comm}(e_1,a,b,\, \mathit{lightning} \rightarrow X^+ \mathit{thunder})$ (1)

express such a commitment. The obvious problem is that the commitment expressed 
by Formula 1 is immediately fulfilled if no lightning is seen at the point of speech, 
because 

$A F^- (\mathit{Happ}(e_1) \wedge (\mathit{lightning} \rightarrow X^+ \mathit{thunder}))$

is true at s if lightning is false at s. This problem, however, is not due to a limitation
of the material conditional, but to the fact that Formula 1 does not correctly represent the
content of the commitment. In fact, the statement to which a commits may be
interpreted in two ways: (i) “always in the future, thunder will be heard immediately
after lightning is seen;” or (ii) “as soon as lightning is seen, thunder will be
heard immediately after.” With the first interpretation, a’s commitment is represented
by

$\mathit{Comm}(e_1,a,b,\, G^+(\mathit{lightning} \rightarrow X^+ \mathit{thunder}))$. (2)

The commitment of Formula 2 can never be fulfilled in finite time, and is violated
at state s’, in the future of s, iff

$A F^- (\mathit{Happ}(e_1) \wedge \neg G^+(\mathit{lightning} \rightarrow X^+ \mathit{thunder}))$

holds at s’, that is, iff

$A F^- (\mathit{Happ}(e_1) \wedge F^+(\mathit{lightning} \wedge \neg X^+ \mathit{thunder}))$

holds at s’. In other words, the commitment of Formula 2 is violated in the future
of s as soon as on all paths starting from s and going through the current state it is the
case that lightning will be seen that is not immediately followed by thunder.

With the second interpretation, the commitment is expressed by 

$\mathit{Comm}(e_1,a,b,\, \mathit{lightning}\; S^+ X^+ \mathit{thunder})$, (3)

where the “as soon as” operator $S^+$ is defined as below:

$\lfloor u\, S^+ v \rfloor =_{def} (\lfloor u \rfloor \rightarrow \lfloor v \rfloor) \wedge (X^+(\lfloor u \rfloor \rightarrow \lfloor v \rfloor)\; W^+ \lfloor u \rfloor)$.

The commitment of Formula 3 is fulfilled at a state s’, in the future of s, iff

$A F^- (\mathit{Happ}(e_1) \wedge ((\mathit{lightning} \rightarrow X^+ \mathit{thunder}) \wedge (X^+(\mathit{lightning} \rightarrow X^+ \mathit{thunder})\; W^+ \mathit{lightning})))$

holds at s’. This formula becomes true at state s’ if, and only if, on all paths starting
from s and going through s’ the first occurrence of lightning at or after s is immediately
followed by thunder. Moreover, as can easily be checked, the commitment of
Formula 3 is violated at a state in the future of s as soon as it is the case that lightning
will be seen that is not immediately followed by thunder.

These examples suggest that a satisfactory logic of commitment is induced by 
Definitions FC and VC, which specify the conditions under which a commitment is 
fulfilled or violated. 



4 Communicative Acts and ACL Messages 

In the previous section we have defined the results of a number of commitment- 
manipulation actions, but we have not yet explained how these actions can be per- 
formed. The idea is the following: agents can perform commitment-manipulation 
actions by exchanging ACL messages, provided certain contextual conditions hold. 

We consider as the fundamental unit of agent communication the exchange of a 
message. By this we mean that a message is sent by an agent, the sender, and received 
by another agent, the receiver. In turn, a message is viewed as a pair made up by a 
type indicator and a body. Type indicators (corresponding to KQML’s performatives) 
are constant symbols taken from a finite set, whose definition is part of the ACL 
specification. The body can be a sentence in some CL, whose abstract syntax is repre- 
sented in our Semantic Language by a first-order term (see Section 3), or a more 
complex structure (for example a tuple of elements), typically including a CL sentence.
When event $e$ is an exchange of a message of type $\tau$ and body $\sigma$, sent by agent
$x$ to agent $y$, we write:

$$\mathit{Done}(e, x, \mathit{exch}(y, \tau, \sigma)).$$

Under given conditions, such an event implies a valid performance of a commit- 
ment-manipulation action. It is important to note that by associating commitment 
manipulation actions to messages, we formally specify a commitment-based semantics
for an ACL. More precisely, the meaning of message $(\tau, \sigma)$ is defined as the effect
that exchanging $(\tau, \sigma)$ has on the network of commitments binding the sender and the
receiver. By defining a coherent set of message types in this way, it is possible to 
specify a Communicative Act Library with its associated semantics. 

Below we analyze a few examples. 

Informing 

We assume that the body of an inform message is an arbitrary CL sentence. Informing 
is then defined as committing to the truth of the message body. More precisely, when 
agent x exchanges with agent y a message of type inform with an arbitrary CL sen- 
tence s as the body, agent x commits, relative to y, to the truth of s: 



$$\mathit{Done}(e, x, \mathit{exch}(y, \mathit{inform}, s)) \rightarrow \mathit{Done}(e, x, \mathit{mc}(x, y, s)). \tag{Inf}$$

Requesting

We assume that the body of a request message is an action expression, which describes
the requested action by indicating its type, its actor, and possibly a temporal
constraint. Concrete action expressions belonging to a specific CL should not be confused
with the first-order term representing the abstract syntax of the expression in the
Semantic Language. For instance, here is a possible concrete action expression
describing the action type of actor ag-1 moving object obj-1 from
location loc-1 to location loc-2 before end-of-turn:

    (action :actor ag-1
            :type (move :object obj-1
                        :from loc-1
                        :to loc-2)
            :deadline end-of-turn)

The abstract syntax of this expression is given by the first-order term

$$u = \mathit{before}(\mathit{done}(ag_1, \mathit{move}(obj_1, loc_1, loc_2)), \mathit{end\text{-}of\text{-}turn}),$$

which in turn can be translated into the Semantic Language formula

$$\lfloor u \rfloor = \mathit{Done}(-, ag_1, \mathit{move}(obj_1, loc_1, loc_2))\; B^{+}\; \mathit{end\text{-}of\text{-}turn},$$

where 

$$\varphi\; B^{+}\; \psi =_{def} \ldots$$

With these assumptions, if term s represents the abstract syntax of an action ex- 
pression, the semantics of a request message is defined by: 

$$\mathit{Done}(e, x, \mathit{exch}(y, \mathit{request}, s)) \rightarrow \mathit{Done}(e, x, \mathit{mp}(y, x, s)). \tag{Req}$$

Accepting 

We define accepting not only with respect to requests, but with respect to precommitments
in general. We assume that the body of an acceptance message is a tuple
including all the elements that uniquely identify the accepted precommitment:

$$\mathit{Done}(e', y, \mathit{exch}(x, \mathit{accept}, (e, y, x, s))) \wedge \mathit{Prec}(e, y, x, s) \rightarrow \mathit{Done}(e', y, \mathit{ap}(e, y, x, s)). \tag{Acc}$$

Ordering

The difference between a request and an order is that while requests can be accepted 
or refused, orders cannot. In the terms of our approach, a request brings about a pre- 
commitment, and an order directly generates a commitment. To issue an order an 
agent must have powers that are not required to simply make a request; however, 
developing an articulated model of power relationships lies beyond the scope of this 
paper. 
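
To make rules (Inf), (Req) and (Acc) more concrete, here is a minimal Python sketch of how a message exchange might be mapped onto commitment-manipulation actions. The class `CommitmentStore`, its method `exchange`, and the representation of commitments and precommitments as (event, debtor, creditor, content) tuples are assumptions introduced for illustration. In particular, the sketch assumes that accepting a precommitment turns it into a commitment with the same event, debtor, creditor and content, and it leaves out orders, which would require a model of powers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    type: str     # type indicator, e.g. "inform", "request", "accept"
    body: object  # abstract syntax of a CL sentence, or a tuple


@dataclass
class CommitmentStore:
    # Both sets hold (event, debtor, creditor, content) tuples.
    commitments: set = field(default_factory=set)
    precommitments: set = field(default_factory=set)

    def exchange(self, event, sender, receiver, msg):
        """Apply the effect of one message exchange to the store."""
        if msg.type == "inform":
            # (Inf): the sender commits, relative to the receiver, to the body.
            self.commitments.add((event, sender, receiver, msg.body))
        elif msg.type == "request":
            # (Req): the sender creates a precommitment whose debtor is the receiver.
            self.precommitments.add((event, receiver, sender, msg.body))
        elif msg.type == "accept":
            # (Acc): the body identifies a pending precommitment (e, y, x, s);
            # it must exist, with the sender as debtor and the receiver as creditor.
            prec = msg.body
            if (prec in self.precommitments
                    and prec[1] == sender and prec[2] == receiver):
                self.precommitments.remove(prec)
                self.commitments.add(prec)


store = CommitmentStore()
store.exchange("e1", "a", "b", Message("request", "move(obj1, loc1, loc2)"))
store.exchange("e2", "b", "a", Message("accept", ("e1", "b", "a", "move(obj1, loc1, loc2)")))
print(store.commitments)
```

An accept message that does not match a pending precommitment is silently ignored here, which mirrors the fact that (Acc) only fires when the Prec condition holds.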



5 Discussion 



In this paper we have presented a logical model of social commitment embedded in
CTL±, a logic of discrete time with no start or end points and branching in the future.
The logical model of commitment has been completely specified at the level of formal
semantics, and this has allowed us to prove some properties of commitment, expressed
by valid formulae of our Semantic Language.

Some of the above issues are dealt with also by other authors in this volume. In 
particular, we would like to discuss the differences and similarities of our approach 
with respect to the works by Jones and Parent, Mallya et al., and Chopra and Singh. 

Jones and Parent 

Commitment-based approaches to ACL semantics have recently been criticized by 
some authors, and in particular by Jones and Parent (this volume). 

A basic criticism is that it is not clear what it means for an agent to commit to the 
truth of a statement. For example, according to Jones and Parent (Section 1), “Should 
not the propositional content of a commitment be a future act of the speaker? If so, to 
what action is [an agent] preparing to commit himself, when asserting p?” 

It seems to us that Jones and Parent take commitment as essentially equivalent to 
obligation. However, this is not the intended meaning of the term ‘commitment’ as 
used in our proposal. Consider for example a person swearing that something is the 
case: here we have an example of a (very strong) commitment to the truth of a sen- 
tence, which may or may not be the description of a future act of the speaker. Com- 
mitting to the truth of a sentence, s, simply means that the debtor of the commitment 
will be in a state of fulfillment if s is settled true, in a state of violation if s is settled 
false, and in a pending state if the truth value of s is still undetermined. This definition 
does not assume the speaker’s obligation, or even the speaker’s ability, to defend the 
content of his or her assertions. 

In Section 2.4, Jones and Parent suggest that it is not acceptable to assume that an
agent commits relative to “everyone to whom he addresses his assertion,” because 
“there may well be members of the audience who do not care whether [the speaker] is 
sincere, and there may be also others who require [him] to be insincere.” Again, there 
seems to be a misunderstanding here. Commitment has nothing to do with sincerity. 
When an agent makes an assertion, it commits to its content, in the sense that the 
agent will enter a state of fulfillment if the content of the assertion is settled true, and 
a state of violation if the content of the sentence is settled false. This is an objective 
fact, in that it solely depends on the conventions underlying assertions, and has nothing
to do with sincerity and/or the audience’s interest in what is asserted.

Let us now analyze the example that Jones and Parent consider problematic for
commitment-based semantics. An agent, $a$, informs a group of agents, say
$\{b_0, b_1, \ldots, b_n\}$, that $p$.² However, there is a previous private agreement between $a$ and
$b_0$, unknown to $\{b_1, \ldots, b_n\}$, to the effect that $a$ is going to lie. If we were dealing with
human language, the solution to this puzzle would be very simple and straightforward.
In human communication, communicative acts are not defined only in terms of
static conventions, but depend on the speaker’s intentions, and on the recognition of
such intentions by the hearers. An utterance with an assertive surface form (i.e., in the
indicative mood) is not going to count as an assertive illocutionary act if there is a
previous agreement between the speaker and the hearer that bypasses usual conventions.
A similar situation occurs, for example, when two actors carry out a dialogue on
the stage. As far as artificial agent communication is concerned, however, we agree
with Jones and Parent that it is advisable to forget about intentions and stick to predefined
conventions. The simplest way to deal with Jones and Parent’s puzzle, then,
would be to assume that agent $a$ actually does enter a state of commitment, relative to
$b_0$, which will be automatically cancelled by the agreement between $a$ and $b_0$. A
slightly more complex but more elegant solution would be to treat the agreement
between $a$ and $b_0$ as setting up a special institutional context, in which $a$’s assertive
message that $p$ will not count as making a commitment to $p$ relative to $b_0$.

² Indeed, Jones and Parent deal with acts of asserting, and we deal with acts of informing.
However, both types of communicative acts belong to the category of assertives, and the
difference between them is irrelevant here.

Jones and Parent claim (Section 2.4) that “the move towards agent commitment [...] 
is the result of a confusion” between preservative norms (or regulative rules, as we 
prefer to call them) and constitutive conventions. Well, we hope not, because we also 
consider this distinction to be fundamental. All the rules connecting the messages to 
commitments are constitutive conventions: they say that messages of certain forms 
count as certain operations on commitments. We have not yet worked on the regula- 
tive component of communication, which in particular will have to include the regula- 
tions for the management of violations. Such regulations, contrary to the constitutive 
conventions we have described in this paper, are likely to be strongly application 
dependent. 

Our comments so far are intended to correct some misunderstandings concerning 
the commitment-based approach. On the other hand, it is probably too early to estab- 
lish the relative merits and drawbacks of this approach in comparison with Jones and 
Parent’s optimality-based proposal. Indeed, the two approaches have at least one point 
in common: they both consider agent communication as fully conventional, and re- 
gard a false assertion as a kind of violation. In our approach, a false assertion is the 
violation of a commitment; in Jones and Parent’s approach, it is a violation of the 
optimality of the signaling system, with respect to its function of facilitating the 
transmission of reliable information. 

Mallya et al. 

Mallya et al. contribute to this volume with a work that focuses on commitments as a 
key component of agent interaction. The authors define commitment as including a 
debtor, a creditor and a content, and also a number of related operations and predicates
that present some analogies to what has been presented in our work. Among the
six commitment operations that the authors define in Section 2.3, CREATE can be
mapped to our mc (make commitment), and RELEASE corresponds to our cc (cancel
commitment). By contrast, Mallya et al. define CANCEL as an operation that only the
debtor of a commitment can perform, generally compensating that cancellation by 
making another commitment. From our point of view, as in the case of orders, to 
perform such a cancellation, agents need special powers that lie beyond the context of 
an ordinary communication framework. We have not considered the possibility of 
transferring commitments among agents, and thus the ASSIGN and DELEGATE opera- 
tions do not find any correspondence in our approach. The DISCHARGE operation 
raises the most critical issues, as it is related to the fulfillment and the violation of a 
commitment, which, in our opinion, are not dealt with in a satisfactory way. More 
specifically, in Section 3.2 the authors define the relevant predicates (satisfied(c) and 
breached(c), where c stands for a commitment) in terms of whether a DISCHARGE 
operation has been performed in the past or not. Still, the authors seem to bypass the 
problem instead of solving it, as in Section 2.3 they assume that “the DISCHARGE 
operation brings about p [the relevant commitment’s content], and conversely, if p 
occurs, the DISCHARGE operation is assumed to have happened”. Such an assumption
“discharges” the authors from defining the truth conditions of the content of a com- 
mitment, which, from our point of view, is one of the key points of the specification 
of a logical model of commitments. 

Chopra and Singh 

Chopra and Singh’s contribution also deals mainly with commitments. In particular,
the authors develop a commitment-based formalism for specifying interaction
protocols. This formalism relies on Non-monotonic Causal Logic (NCL), as the
authors take into account the fact that protocol specifications often have to deal with
defeasible reasoning. We also think that, in general, defeasible reasoning plays an
important role in communicative interactions. For example, we assume that an agent
has special powers to bring about particular (communicative) events if and only if
such a condition is explicitly stated. This shows that the definitions dealing with powers
rely on a closure assumption, which inevitably requires some form of non-monotonic 
reasoning. Our main concern about the work by Chopra and Singh is that it does not 
provide enough evidence for the advantages of choosing NCL instead of other 
non-monotonic reasoning schemes described in the literature. 



6 Future Work 

Even if social commitment has already been proposed [2,12] as a basis for the defini- 
tion of ACL semantics, no full formal account of commitment has been put forward 
so far. Needless to say, we are just at the beginning of a long road. Below we point out
some aspects that need to be further investigated. 

Time 

A sound and complete formal system for CTL± has to be developed. This result
should be easy to achieve by extending some known formal system for CTL*. It would
also be important to develop efficient model checking techniques for at least a sublanguage
of CTL±. Moreover, it may be worthwhile to consider an extension of CTL±
dealing with dense time, in order to give a more flexible account of interactions in a 
multiagent system. 

Another important aspect is the expression of temporal qualifications in content 
sentences. Indeed, CTL± is a powerful but very abstract temporal language. In many
practical applications, like for example in the field of e-business, we can expect that 
temporal qualifications will be expressed with respect to some standard date system, 
like the Gregorian calendar. The critical point here is to specify a language by which 
common temporal qualifications can be represented in a natural and transparent way 
(see for example [10]). 

Action 

In this paper we have defined a minimal set of logical tools for the treatment of ac- 
tion. We feel, however, that it might be worthwhile to embed our logic of commit- 
ment in a richer language, possibly based on some version of dynamic logic. 

An important point in our treatment is the association between an action and its re- 
sults. In the case of commitment, this association has been represented by inserting an 
event-denoting term as the first argument of the Comm and Prec predicates. This
solution has proved sufficient for our current goals, but may be difficult to extend to 
more complex situations. 

Commitment 

The main contribution of this paper is the logical treatment of commitment. Commit- 
ment is intrinsically a second-order concept, in that an agent commits to a proposition. 
Driven by a concern for simplicity, we decided to represent a commitment by a first- 
order predicate, and its content as a first-order term. 

In designing our representation of commitment we have constantly kept in mind 
the reasons that motivate the development of a logical model in an area of Computer 
Science. In our opinion, the rigor and precision given by the use of logic is highly 
valuable, but should never bring us too far from practical applications, lest we give up 
the hope of influencing actual software practice. 

We believe that our model of commitment can easily be translated into the concep- 
tual toolkit and jargon of software designers. More precisely, commitments may be 
viewed as instances of a “commitment class,” whose instance variables contain: a 
reference to the commitment-inducing event (a message exchange), two references to 
agents (the debtor and the creditor), and an abstract representation of a CL sentence. 
In such a context, the commitment manipulation actions can be regarded as methods 
of the commitment class, with formal specification given by Axioms MC, MP, and 
AP of Section 3.2. Continuing this line of thought, the definitions of fulfillment and 
violation can be viewed as the core specification of a “commitment management 
system,” which may be in charge of monitoring communicative exchanges in a multi- 
agent system. Finally, the examples of Section 4 suggest that a Communicative Act 
Library may define a communicative act by specifying: (i), the general form of the 
class of messages by which the act is performed; (ii), relevant contextual conditions 
for a successful execution of the communicative act; and (iii), the effect of a success- 
ful execution of the communicative act, expressed in terms of commitment- 
manipulation actions. A first proposal in this direction is presented by Fornara and 
Colombetti in [6] and [7]. However, there are still a number of discrepancies between 
the logical model presented in this paper and the above-mentioned operational model. 
Integrating the two levels of our model of agent communication is a major goal for 
our future research. 
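
As a rough illustration of the "commitment class" reading suggested above, the following Python sketch models a commitment as an object with the four instance variables listed in the text, plus a status method that follows the informal account given in Section 5: fulfilled if the content is settled true, violated if it is settled false, pending otherwise. The class name `Commitment`, the method `status`, and the `settled` callback are hypothetical; the sketch does not implement Axioms MC, MP and AP or Definitions FC and VC, and it assumes that some external monitor can report whether a content has been settled.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Optional


@dataclass(frozen=True)
class Commitment:
    event: Hashable    # reference to the commitment-inducing message exchange
    debtor: str        # agent bound by the commitment
    creditor: str      # agent relative to whom the commitment holds
    content: Hashable  # abstract representation of a CL sentence

    def status(self, settled: Callable[[Hashable], Optional[bool]]) -> str:
        """Return 'fulfilled', 'violated' or 'pending'.

        settled(content) is assumed to return True or False once the truth
        value of the content is settled, and None while it is undetermined.
        """
        value = settled(self.content)
        if value is None:
            return "pending"
        return "fulfilled" if value else "violated"


# Hypothetical monitor state: a dictionary of contents whose truth is settled.
c = Commitment("e1", "a", "b", "door_closed")
print(c.status({}.get))
print(c.status({"door_closed": False}.get))
```

With an empty dictionary the commitment is reported as pending; once the monitor records the content as settled false, the same commitment is reported as violated.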

Abstract. This paper proposes a formal framework which offers an external
representation of conversations between conversational agents. Using this for- 
malism allows us: (1) to represent the dynamics of conversations between 
agents; (2) to analyze conversations; (3) to help autonomous agents to take part 
in consistent conversations. The proposed formalism, called “commitment and 
argument network”, uses a combined approach based on commitments and ar- 
guments. Commitments are used to capture the social and the public aspect of 
conversations. Arguments on the other side are used to capture the reasoning 
aspect. We also propose a layered communication model in which the formal- 
ism and the approach take place. 



1 Introduction 

In the multi-agent domain, it is widely recognized that communication between 
autonomous agents is a challenging research area that involves several disciplines: 
philosophy of language, social psychology, artificial intelligence, logics, mathematics, 
etc. In a multi-agent system, agents may need to interact in order to negotiate, to solve 
conflicts of interest, to cooperate, etc. [16]. All these communication requirements
cannot be fulfilled by simply exchanging messages. Agents must be able to take part 
in coherent conversations which result from the performance of coordinated speech 
acts [31]. 

Three main approaches have been proposed to model communication between 
software agents in general and to define a semantics for agent communication lan- 
guages (ACLs). These approaches are: the mental approach, the social approach, and 
the argumentative approach. 

In the mental approach, the agent’s so-called mental structures (e.g. beliefs, desires
and intentions) are used to model conversations and to define a formal semantics of 
speech acts. In the first system that was based on these notions, speech acts were 
planned like non-communicative actions [10]. It was used by [21] and [22] to define a 
formal semantics of KQML. However, this semantics has been criticized for not being 
verifiable because one cannot verify whether the agents’ behavior matches their private
mental states [13] [5]. This approach is used in the development of a pragmatics
for agent communication based on cognitive coherence theory [29].

An alternative to the mental approach was proposed by [33] under the name of social
approach. In contrast to the mental approach, this approach emphasizes the importance
portance of conventions as well as the public and social aspects of conversations. It is 
based on social commitments that are thought of as social and deontic notions. Social 
commitments are commitments towards the other members of a community. They 
differ from the agent’s internal psychological commitments which capture the persistence
of intentions as specified in the rational interaction theory [9]. As a social notion,
commitments are a base for a normative framework that makes it possible to
model the agents’ behaviour. This notion has been used to define a formal semantics 
that is verifiable [32] [11] [36] [24]. The role of social commitments in modeling and 
specifying agent interactions is widely recognized. They are used in order to specify 
ACL protocols [18] and to represent these protocols by capturing interactions that 
describe new scenarios and by using causal logic [6].

Another approach, called the argumentative approach, was proposed by [2] as a 
method for modelling dialogue. It also has been used to define a semantics of some 
communicative acts [1] and to define protocols [26] [28]. It is based upon an argu- 
mentation system in which the agents’ reasoning capabilities are often linked to their 
ability to argue. They are mainly based on the agent’s ability to establish a link be- 
tween different facts, to determine if a fact is acceptable, to decide which arguments 
support which facts, etc. The approach relies upon the formal dialectics introduced by 
[20] and [23]. Dialectical models are rule-governed structures of organized conversa- 
tions in which two parties (in the simplest case) speak in turn in an orderly way. 

Recently, researchers have begun to address the issues raised by conversation poli- 
cies. According to [25] two approaches can be distinguished: Commitment-based 
protocols and dialogue-game based protocols. The first approach uses social commit- 
ments to specify the sequences of utterances. The second one considers that protocols 
are captured within appropriate structures that can be combined in different ways to 
form the global structure of a dialogue [12]. 

Despite all the research work focused on modeling dialogue and semantic issues,
few researchers have addressed the issue of representing the dynamics and the coher- 
ence of conversations. The purpose of this paper is to propose a formal framework 
that can represent agent actions likely to take place in a conversation. These actions 
are interpreted in terms of creation and of positioning on social commitments and 
arguments. The proposed formalism allows us to model the dynamics of conversa- 
tions and offers an external representation of the conversational activity. This notion 
of external representation [7] is very useful because it provides conversational agents 
with a common understanding of the current state of the conversation and its ad- 
vancement. An example of such an external representation is the conversational 
model proposed by [30]. Based on our formalism, a model is made available to the 
agents and they can access it simultaneously. The formalism also allows us to ensure 
conversational consistency when considering the actions performed by the agents. 
Called "commitment and argument network" (CAN), our formalism relies on an approach
combining commitments and arguments [3]. This approach has the advantage
of capturing both the social and public aspects of a conversation, and the reasoning 
aspect required in order to take part in coherent conversations. The formalism can 
clearly illustrate the creation steps of new commitments and the positioning steps on 
these commitments, as well as the argumentation and justification steps. This formalism
supposes that conversational agents are able to manipulate commitments and
arguments. Therefore, the agent architecture must take this aspect into account.

In Section 2 we present our vision of a communication model. In Section 3 we discuss
a model of social commitments which is a part of our communication model, and
we show how speech acts can be interpreted as actions on these commitments. In 
Section 4 we introduce the argumentation aspect and we illustrate the link between 
commitments and arguments. The foundations of the CAN formalism are presented in 
Section 5. We also give an example of the analysis of a dialogue and we show how 
our formalism can be used either to analyze a conversation or as a means that allows 
agents to take part in conversations. 



2 A Communication Model 

The model that we propose combines the three approaches discussed in the introduction.
It is based on a hybrid approach that we call MSA (Mental-Social-Argumentative).
Indeed, if they are taken individually, the three approaches introduced earlier do
not allow us to model all the aspects of conversations. For this reason, we suggest
combining them in a unified approach. In addition, the conversation is a cognitive and
social activity which requires a mechanism making it possible to reason on mental 
states, on what other agents say (public aspects) and on the social aspects (conven- 
tions, standards, obligations, etc). These three approaches are thus not exclusive but 
rather complementary. 

The MSA approach has the advantage of capturing simultaneously the mental as- 
pect that characterizes the agents participating in a conversation, the social aspect that 
reflects the context in which these agents communicate and the reasoning aspect 
which is essential to be able to take part in coherent conversations. The combination 
of commitments and arguments seems essential to us because agents must be able to 
justify the facts on which they are committed and to justify their actions on commit- 
ments. This justification cannot be made if the agents do not have the necessary ar- 
gumentation mechanisms. In addition, the combination of commitments and private 
mental states is necessary because public commitments reflect these mental states. 
Finally, the combination of argumentation and mental states is significant because 
agents have to reason on their mental states before committing in a conversation. 

The model of communication is composed of three layers: the conversation layer, 
the commitment/argument layer and the cognitive layer. This stratification in layers is 
justified by the abstraction levels. The conversation layer is directly observable be- 
cause it is composed of the speech acts that the agents perform. These acts are not 
performed in an isolated way, but within a particular conversation. The commit- 
ment/argument layer is used to correctly manage the social commitments and the 
arguments that are related to the conversation. These commitments and arguments are 
not directly observable, but they should be deduced from the speech acts performed 
by the agents. Finally, the cognitive layer is used to take into account the private men- 
tal states of the agents, the social relations and other elements that the agents use in 
order to communicate. In this paper we propose a formalism that is used to model the 
elements composing the second layer. 

In order to allow conversational agents to suitably use the communication model, 
this model must be compatible with the agent architecture. Thus, we propose an archi- 
tecture of conversational agent which is composed of three models: the mental model, 
the social model and the reasoning model (Fig. 1). The mental model includes beliefs, 
desires, goals, etc. The social model captures the social concepts such as conventions,
roles, etc. Social commitments constitute a significant component of this model. A
social commitment is a participant’s public attitude relative to a proposition. It defines a
particular relationship between a participant and a statement. The commitments that 
the agent makes public when performing speech acts are different from the private 
mental states, but these two elements are not independent. Indeed, social commit- 
ments reflect mental states. Thus, agents must use their reasoning capabilities to rea- 
son on their mental states before producing or manipulating social commitments. The 
agent’s reasoning capabilities are represented by the reasoning model via an argumen- 
tation system. The conversational agent model is formed by general knowledge, such 
as the knowledge on the conversation subject. This knowledge will be used by the 
agent in order to build the common ground that it must share with its partners. The 
notion of common ground introduced by the philosophers of language Clark and 
Haviland [8] indicates the set of knowledge, beliefs and presuppositions which the 
agents believe that they share during their conversations. 



Fig. 1. The links between the conversational agent architecture and the communication model



3 Social Commitment Formulation 

A social commitment is a commitment made by an agent (called the debtor), that
some fact is true. This commitment is directed to a set of agents (called creditors) [4].
The commitment content is characterized by time $t_{\varphi}$, which is different from the utterance
time denoted $t_u$ and from the time associated with the commitment and denoted
$t_{sc}$. Time $t_{sc}$ refers to the time during which the commitment is in force. It can correspond
to a fixed value or an interval. When it is an interval, this time is denoted
$[t_{sc}^{inf}, t_{sc}^{sup}]$. When a temporal bound is instantiated, it takes a numerical value which
respects the time unit used by the agents. We denote a social commitment as follows:

Definition 1: $SC(id_n, Ag_1, A^{*}, t_{sc}, \varphi, t_{\varphi})$

where $id_n$ is an integer identifying the commitment, $Ag_1$ the debtor, $A^{*}$ the set of the
creditors ($A^{*} = A \setminus \{Ag_1\}$, where $A$ is the set of participants), $t_{sc}$ is the time associated
with the commitment, $\varphi$ its content and $t_{\varphi}$ the time associated with the content $\varphi$. To
simplify the notation, we suppose throughout this paper that $A = \{Ag_1, Ag_2\}$. For
example, the utterance:

(Example 1)

U: "I met agent $Ag_3$ on MSN one hour ago"

leads to the creation of the commitment:

$SC(id_n, Ag_1, Ag_2, t_{sc}, Meet(Ag_1, Ag_3, MSN), t_{sc} - 1h)$.

The creation of such a commitment is an action denoted:

$Create(Ag_1, t_u, SC(id_n, Ag_1, Ag_2, t_{sc}, Meet(Ag_1, Ag_3, MSN), t_{sc} - 1h))$.

In general an action Act performed by an agent $Ag_1$ on a social commitment SC is
denoted:

Definition 2: $Act(Ag_1, t_u, SC(id_n, Ag_1, Ag_2, t_{sc}, \varphi, t_{\varphi}))$

Example 1 illustrates that there is a mapping between a speech act and a social com- 
mitment. Singh [32] and Colombetti [11] propose a social semantics of speech acts 
using such a mapping. In our approach, we go beyond Singh’s and Colombetti’s mod- 
els and interpret a speech act as an action performed on a commitment in order to 
model the dynamics of conversations. This interpretation can be denoted by : 

Definition 3: $SA(i_k, Ag_1, Ag_2, t_u, U) \vdash_{def} Act(Ag_1, t_u, SC(id_n, Ag_1, Ag_2, t_{sc}, \varphi, t_{\varphi}))$

where $\vdash_{def}$ means “is interpreted by definition as”, SA is the abbreviation of "Speech
Act", $i_k$ the identifier of the speech act and Act indicates the action performed by the
debtor on the commitment. The definiendum ($SA(i_k, Ag_1, Ag_2, t_u, U)$) is defined by the
definiens ($Act(Ag_1, t_u, SC(id_n, Ag_1, Ag_2, t_{sc}, \varphi, t_{\varphi}))$) as an action performed on a social
commitment. The agent that performs the speech act is the same agent that performs 
the action Act. Act can take one of four values: Create, Withdraw, Violate and Fulfill. 
These four actions are the actions that the debtor can apply to a commitment. This 
reflects only the debtor’s point of view. However, we must also take into account the 
creditors when modeling a conversation which is, by definition, a joint activity. We 
thus propose modeling the creditors’ actions which do not apply to the commitment, 
but to the contents of this commitment This separation between the commitment and 
its content enables us to remain compatible with the semantics of commitments, i.e. 
the fact that only the debtor can handle its commitment. The semantics associated 
with this types of actions is expressed in terms of argumentation (see Section 4.2). 
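
As an illustration only, Definitions 2 and 3 can be sketched in Python. The class and field names below are assumptions introduced for the example, the content φ is kept as an opaque string, and t_sc is assumed equal to the utterance time; none of this is prescribed by the formalism.

```python
from dataclasses import dataclass

# The four debtor actions named in the text.
DEBTOR_ACTIONS = {"Create", "Withdraw", "Violate", "Fulfill"}


@dataclass(frozen=True)
class SocialCommitment:
    """SC(id_n, Ag_1, Ag_2, t_sc, phi, t_phi); field names are hypothetical."""
    ident: int
    debtor: str
    creditor: str
    t_sc: float       # time associated with the commitment
    content: str      # phi, kept as an opaque string in this sketch
    t_content: float  # time associated with the content


@dataclass(frozen=True)
class Act:
    """Act(Ag, t_u, SC): an action performed on a commitment at utterance time t_u."""
    name: str
    agent: str
    t_u: float
    commitment: SocialCommitment

    def __post_init__(self):
        if self.name not in DEBTOR_ACTIONS:
            raise ValueError(f"unknown action: {self.name}")
        # Only the debtor can handle its own commitment.
        if self.agent != self.commitment.debtor:
            raise PermissionError("only the debtor can act on a commitment")


def interpret_example_1(t_u: float) -> Act:
    """Read the utterance of Example 1 as a Create action (time unit: hours).

    Assumption: t_sc is taken to be the utterance time t_u.
    """
    sc = SocialCommitment(1, "Ag1", "Ag2", t_u, "Meet(Ag1, Ag3, MSN)", t_u - 1.0)
    return Act("Create", "Ag1", t_u, sc)
```

Rejecting any action whose agent is not the debtor mirrors the constraint that only the debtor can handle its commitment; creditor moves belong to the separate Act-content family introduced next.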

Hence, we must differentiate between the actions applied on a commitment (Act) 
and the actions performed on the content of a commitment ( Act-content ). We denote 
an action applied on the content of a commitment as follows: 

Definition 4: Act-content(Ag_k, t_u, SC(id_n, Ag_i, Ag_j, t_sc, φ, t_φ))

where i, j ∈ {1, 2} and (k = i or k = j). Agent Ag_k can thus act on the content of its own
commitment (in this case we get k = i) or on the content of the commitment of another
agent (in this case we get k = j).

In addition, the actions that can be carried out by the debtor on the commitment
content are different from the actions that can be carried out by the creditor. The
debtor can change the content of its own commitment, or defend it if the creditor
refuses it or questions it. The creditor can refuse the content of another agent's
commitment, accept it or question it.

Thus, a speech act leads either to an action on a commitment when the speaker is 
the debtor, or to an action on a commitment content when the speaker is the debtor or 
the creditor. When an agent acts on the content of a commitment created by another 
agent we refer to this as “taking a position on a commitment content”. However, it 
should be noted that the same utterance can lead both to taking a position on the con- 
tent of an existing commitment and to the creation of a new commitment. Generally, a 
speech act leads to an action on a commitment and/or an action on a commitment 
content. Formally: 

Definition 5: SA(Ag_1, Ag_2, t_u, U) ⊢def
Act(Ag_1, t_u, SC(id, Ag_1, Ag_2, t_sc, φ, t_φ))
and/or
Act-content(Ag_k, t_u, SC(id, Ag_i, Ag_j, t_sc, φ, t_φ))
where i, j ∈ {1, 2} and (k = i or k = j).



3.1 The Notion of State 

A commitment can evolve and be transformed as a result of the actions that the debtor 
performs on it (creation, withdrawal, violation and fulfillment). Its content may also 
be transformed following the actions that the debtor and the creditors apply to it 
(change, acceptance, justification, etc.). Therefore, the agents act on their own com- 
mitments and on the content of both these commitments and other agents’ commit- 
ments, which leads to their transformation. Hence the notion of state, which makes it 
possible to capture the evolution of commitments and their contents. However, we 
must distinguish between the notion of the commitment state [17] and the notion of 
the content state relative to this commitment as we propose here. Indeed, whenever an 
agent acts on its commitment, the commitment state is affected; whereas when the 
agent acts on the content of a commitment, the content state is transformed. Indeed, 
the notion of commitment state alone does not reflect the conversation dynamics since 
it only captures the debtor’s actions on its commitment. The two states (the commit- 
ment state and the content state of the commitment) reflect this dynamics. This notion 
is of great importance since it allows us to keep a trace of the dialogue evolution in so 
far as each speech act leads to an action performed on a commitment or on its content. 

Here are the states that we propose to use in our model. Once created, a commit- 
ment will take the active state and its content the submitted state. This expresses the 
fact that the content is presented for possible negotiation. A commitment can be in 
one of four states: active, fulfilled, cancelled and violated.

A commitment content can take six states: submitted, changed, refused, accepted, 
questioned and justified. These states and the operations which trigger them depend 
on the commitment type. Hence, the commitment state and the content state are two 
parameters which characterize this commitment at any moment. Thus, we need to
revise the definition of a commitment (Definition 1) by adding two new parameters. So,
a social commitment is an 8-tuple:

Definition 6: SC(id_n, Ag_1, Ag_2, t_sc, S, S_content, φ, t_φ)

where S is a vector representing the successive commitment states and S_content a vector
representing the successive content states. Using vectors as parameters for commitment and
content states makes it possible to keep track of all the transitions that reflect the
evolution of the commitments and their contents.
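
The two state vectors of Definition 6 can be pictured as append-only histories. The following Python sketch is only illustrative: the class name and the action-to-state tables are assumptions (the text names the states and the actions, but only some pairings, such as Question leading to "questioned", are shown explicitly in the example of Section 5.2).

```python
from dataclasses import dataclass, field

# Assumed pairings between actions and the states they trigger.
COMMITMENT_STATE = {"Withdraw": "cancelled", "Violate": "violated", "Fulfill": "fulfilled"}
CONTENT_STATE = {"Change": "changed", "Refuse": "refused", "Accept": "accepted",
                 "Question": "questioned", "Defend": "justified"}


@dataclass
class TrackedCommitment:
    debtor: str
    creditor: str
    content: str
    # Once created, the commitment is active and its content is submitted.
    S: list = field(default_factory=lambda: ["active"])
    S_content: list = field(default_factory=lambda: ["submitted"])

    def act(self, agent: str, action: str) -> None:
        """Debtor-only action on the commitment itself; appends to S."""
        if agent != self.debtor:
            raise PermissionError("only the debtor acts on the commitment")
        self.S.append(COMMITMENT_STATE[action])

    def act_content(self, agent: str, action: str) -> None:
        """Action on the content by the debtor or the creditor; appends to S_content."""
        if agent not in (self.debtor, self.creditor):
            raise PermissionError("agent is not a participant")
        self.S_content.append(CONTENT_STATE[action])
```

Keeping full histories rather than a single current state is what allows the trace of the dialogue to be read back from the commitment itself.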

3.2 Classification 

In the literature [38] [17], several commitment types have been proposed. Similarly to 
the classification suggested by [17] we distinguish absolute commitments, conditional 
commitments and commitment attempts. 

3.2.1 Absolute Commitments 

Absolute commitments are commitments whose fulfillment does not depend on any 
particular condition. Two types can be distinguished: propositional commitments and 
action commitments. 

Propositional commitments 

Propositional commitments are related to the state of the world. They are generally, 
but not necessarily, expressed by assertives. They can be directed towards the past, 
the present, or the future. We denote a propositional commitment as follows: 

Definition 7: PC(id_n, Ag_1, Ag_2, t_pc, S, S_content, p, t_p)
where p is the proposition on which Ag_1 commits.

Action commitments 

Contrary to propositional commitments, action commitments (also called commit- 
ments to a course of action) are always directed towards the future and are related to 
actions that the debtor is committed to carrying out. The fulfillment and the lack of 
fulfillment of such commitments depend on the performance of the underlying action 
and the specified delay. This type of commitment is typically conveyed by promises. 
We denote an action commitment as follows: 

Definition 8: AC(id_n, Ag_1, Ag_2, t_ac, S, S_content, α, t_α)
where α is the action to be carried out.

3.2.2 Conditional Commitments 

Absolute commitments do not consider the conditions that may restrain their fulfill- 
ment. However, in several cases, agents need to make commitments not in absolute 
terms but under given conditions. Another commitment type is therefore required. 
These commitments are said to be conditional. The structure of a conditional com- 
mitment which must reflect the underlying condition, is different from the structure of 
a social commitment ( Definition 6). We denote a conditional commitment as follows: 

Definition 9: CC(id_n, Ag_1, Ag_2, t_cc, S, S_content, (β, t_β) ⇒ (γ, t_γ))

where ⇒ stands for classical implication. This commitment expresses the fact that if
β is true (or carried out) at time t_β, then Ag_1 will be committed towards Ag_2 to carrying
out γ, or to γ being true, at time t_γ. The addition of the symbol ⇒ in the formula enables us to
better illustrate the implication relation existing between the condition and the action.
3.2.3 Commitment Attempts

The commitments described so far directly concern the debtor who commits either 
that a certain fact is true or that a certain action will be carried out. For example, these 
commitments do not allow us to explain the fact that an agent asks another one to be 
committed to carrying out an action (by a speech act of a directive type). To solve this 
problem, we propose the concept of commitment attempt inspired by the notion of 
pre-commitment proposed in [11]. We consider a commitment attempt as a request
made by a debtor to push a creditor to be committed. Thus, when an agent Ag_1
requests another agent Ag_2 to do something, we say that the first agent is trying to
induce the other agent to make a commitment. We denote a commitment attempt as
follows:

Definition 10: CT(id_n, Ag_1, Ag_2, t_ct, S, S_content, φ, t_φ)

where φ is the content of the commitment attempt. A commitment attempt is thought
of as a type of social commitment because it conveys content which is made public
once the attempt is performed. However, in our approach, there is a true commitment
only after the creditor agent reacts in response to the commitment attempt. The debtor 
and the creditor of a commitment attempt can act both on the attempt and on its con- 
tent. On the one hand, the creditor agent reserves the right to accept a commitment 
attempt definitively, to accept it conditionally, to refuse it or to suspend it by asking 
for a period of reflection. It can also question the content of a commitment attempt. 
On the other hand, the debtor agent can cancel a commitment attempt. It can also 
change the content of a commitment attempt and defend it. Like a social commitment, 
a commitment attempt can be related to a proposition, an action or a condition. The 
evolution of the states of commitments and of their contents as well as the different 
rules for manipulating the commitment attempts are detailed in [3].
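
The creditor's possible reactions to a commitment attempt can be sketched as follows. This is a simplified illustration, not the rule set of [3]: the class names are hypothetical, only definitive acceptance is modelled, and the role swap on acceptance (the agent that was asked becomes the debtor of the resulting action commitment) is an assumption drawn from the informal description above.

```python
from dataclasses import dataclass
from typing import Optional

# Reactions that the text grants to the creditor of a commitment attempt.
CREDITOR_REPLIES = {"Accept", "Accept conditionally", "Refuse", "Suspend"}


@dataclass(frozen=True)
class CommitmentAttempt:
    debtor: str     # the requesting agent
    creditor: str   # the agent asked to commit
    content: str


@dataclass(frozen=True)
class ActionCommitment:
    debtor: str
    creditor: str
    action: str


def react(attempt: CommitmentAttempt, reply: str) -> Optional[ActionCommitment]:
    """Return the commitment produced by the creditor's reply, if any.

    Only "Accept" yields a commitment here; the other replies return None
    because their treatment is not spelled out in this section.
    """
    if reply not in CREDITOR_REPLIES:
        raise ValueError(f"unknown reply: {reply}")
    if reply == "Accept":
        # Assumed role swap: the agent that was asked now owes the action.
        return ActionCommitment(debtor=attempt.creditor,
                                creditor=attempt.debtor,
                                action=attempt.content)
    return None
```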



4 Argumentation 

In artificial intelligence, argumentation is used in two distinct ways: to structure 
knowledge or to model dialectical reasoning. The first approach aims at determining 
how utterances form arguments and how arguments can be decomposed. This approach
has been used in Toulmin's model [35]. On the other hand, the second approach
deals with argument construction. Models suggested for example in [14] and
[15] follow this approach. When considering dialogue modeling, the second approach
seems to be more relevant because agents must be able to produce arguments support- 
ing their propositions. 



4.1 Formulation 

An argumentation system essentially includes a logical language, a definition of the 
argument concept, a definition of the attack relation between arguments and finally a 
definition of acceptability. Several definitions were also proposed for the argument 
concept. In our model, we adopt the following definitions of [15]. Here Γ indicates a
possibly inconsistent knowledge base with no deductive closure, ⊢ stands for classical
inference and ≡ for logical equivalence.

Definition 11: An argument is a pair (H, h) where h is a formula of L and H a subset
of Γ such that: i) H is consistent, ii) H ⊢ h and iii) H is minimal, so no proper subset of H
satisfying both i and ii exists. H is called the support of the argument and h its
conclusion.

Definition 12: Let (H_1, h_1), (H_2, h_2) be two arguments.

(H_1, h_1) attacks (H_2, h_2) iff ∃h ∈ H_2 such that h ≡ ¬h_1. In other words, an argument
is attacked if and only if there exists an argument for the negation of an element of its
support.

We can now define the concept of acceptability [14]: 

Definition 13: An argument (H, h) is acceptable for a set S of arguments iff for any
argument (H', h'): if (H', h') attacks (H, h) then (H', h') is attacked by S.

Intuitively, an argument is acceptable if it is not attacked, if it defends itself against all 
its attackers, or if it is defended by an acceptable argument. 
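
Definitions 12 and 13 can be made concrete with a small Python sketch. It is deliberately simplified: formulas are plain strings, negation is the prefix "~", and logical equivalence is approximated by string equality after collapsing double negation. These choices, and all names below, are assumptions made for the example; the sketch does not check consistency or minimality of supports (Definition 11).

```python
from typing import FrozenSet, Iterable, NamedTuple


class Argument(NamedTuple):
    support: FrozenSet[str]   # H
    conclusion: str           # h


def neg(formula: str) -> str:
    """Syntactic negation with '~'; a leading double negation is collapsed."""
    return formula[1:] if formula.startswith("~") else "~" + formula


def attacks(a: Argument, b: Argument) -> bool:
    """Definition 12: a attacks b iff some h in b's support is the negation of a's conclusion."""
    return neg(a.conclusion) in b.support


def acceptable(arg: Argument, defenders: Iterable[Argument],
               universe: Iterable[Argument]) -> bool:
    """Definition 13: every attacker of arg found in the universe is attacked by a defender."""
    defenders = list(defenders)
    return all(any(attacks(d, x) for d in defenders)
               for x in universe if attacks(x, arg))
```

For instance, with a1 = ({q, q->p}, p), a2 = ({r, r->~q}, ~q) and a3 = ({s, s->~r}, ~r), a2 attacks a1 and a3 attacks a2, so a1 is acceptable for the set {a3} but not for the empty set.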



4.2 Linking Commitments and Arguments 

Argumentation is based on the construction of arguments and counter-arguments 
(arguments attacking other arguments), the comparison of these various arguments 
and finally the selection of the arguments that are considered to be acceptable. In our 
approach, agents must reason on their own mental states in order to build arguments 
in favor of their future commitments, as well as on other agents’ commitments in 
order to be able to take position with regard to the contents of these commitments. 
The systems proposed in the literature, for example in [14] and [37], do not take into
account the arguments which can support actions on commitments. It is these argu- 
ments which we define in this section. 

In fact, before committing to some fact h being true (i.e. before creating a com- 
mitment whose content is h), the speaker agent must use its argumentation system to 
build an argument (H, h). On the other side, the addressee agent must use its own 
argumentation system to select the answer it will give (i.e. to decide about the
appropriate manipulation of the content of an existing commitment). For example, an agent
Ag_1 accepts the commitment content h proposed by another agent Ag_2 if its argumentation
system is compatible with h, i.e. if it is able to build an argument which supports
this content from its knowledge base. If Ag_1 has an argument (H', ¬h), then it
refuses the commitment content proposed by Ag_2. Now, if Ag_1 has an argument neither
for h nor for ¬h, then it must ask for an explanation. Surely, an argumentation
system is essential to help agents act on commitments and their contents. However, 
reasoning on other mental and social attitudes (beliefs, intentions, conventions, etc.) 
should be taken into account in order to explain the agents’ decisions in a broader 
context than the agents interactions [27]. We do not address this issue in this paper. 
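
The three-way decision described in this paragraph (accept, refuse, or ask for an explanation) can be sketched as a small function. The function name is hypothetical, formulas are strings with "~" for negation, and the agent's argumentation system is reduced to the set of conclusions it can argue for; checking acceptance before refusal when both apply is an arbitrary choice of this sketch.

```python
from typing import Set


def position_on(content: str, own_conclusions: Set[str]) -> str:
    """Choose the addressee's action on a proposed commitment content.

    own_conclusions -- conclusions h for which the agent can build an argument (H, h)
    """
    negation = content[1:] if content.startswith("~") else "~" + content
    if content in own_conclusions:
        return "Accept"      # an argument supporting the content exists
    if negation in own_conclusions:
        return "Refuse"      # an argument for the negation exists
    return "Question"        # neither: ask for an explanation
```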

Thus, we claim that an agent should always use its argumentation system before 
creating a new commitment or positioning itself on an existing commitment and on its 
content. Consequently, an argument of an agent Ag_k must support an action performed
by this agent on a given commitment and/or on its content. Formally we denote:
Definition 14: Arg(Ag_k, H, Act(Ag_k, t_u, SC(id, Ag_i, Ag_j, t_sc, S, S_content, φ, t_φ)))
Definition 15: Arg(Ag_k, H, Act-content(Ag_k, t_u, SC(id, Ag_i, Ag_j, t_sc, S, S_content, φ, t_φ)))
where H is the support of the argument and the agent identifiers i, j, k satisfy: i,
j, k ∈ {1, 2}, i ≠ j and (k = i or k = j). In the first formula, H is the support of the action Act
performed by agent Ag_k on commitment SC. In the second formula, H is the support
of the action Act-content performed by agent Ag_k. Act-content is an action on the
content of the commitment SC.

The relation between H and the commitment content φ is defined according to the
value of Act and Act-content. For instance, for an absolute or a conditional
commitment we have:

Act ∈ {Create, Fulfill} ⇒ H ⊢ φ

I.e. if Act takes the value "Create" or "Fulfill", then H defends φ. In the same way:

Act ∈ {Withdraw} ⇒ H ⊢ ¬φ

Act-content ∈ {Accept, Change, Defend} ⇒ H ⊢ φ

Act-content ∈ {Refuse} ⇒ H ⊢ ¬φ

An agent can question a commitment content φ if it has an argument neither for φ nor
for ¬φ. Formally we have:

∄ H such that H ⊢ φ or H ⊢ ¬φ

For the other types of commitments, this relation is detailed in [3].

A speech act can lead to an action not only on a commitment as explained in Sec- 
tion 3, but also on an argument. An agent can thus accept, refuse, defend or attack an 
argument. Thus we have: 

Definition 16: SA(i_l, Ag_i, Ag_j, t_u, U) ⊢def
Act-arg(Ag_i, t_u, Arg(Ag_k, H, Act(Ag_m, t_u, SC(id, Ag_x, Ag_y, t_sc, S, S_content, φ, t_φ))))

Definition 17: SA(i_l, Ag_i, Ag_j, t_u, U) ⊢def
Act-arg(Ag_i, t_u, Arg(Ag_k, H, Act-content(Ag_m, t_u, SC(id, Ag_x, Ag_y, t_sc, S, S_content, φ, t_φ))))
where Act-arg ∈ {Accept, Refuse, Defend, Attack}, i, j, k, m, x, y ∈ {1, 2} and i ≠ j,
x ≠ y, (k, m = i or k, m = j).



5 Using the CAN Formalism for Conversation Representation 

So far, we have presented our framework of commitments and of the relations between
these commitments and arguments. Indeed, our goal is to represent speech acts in a 
single approach based on commitments and arguments. This approach aims at offer- 
ing software agents a flexible means to interact in a coherent way. Thus, agents can 
participate in conversations by manipulating commitments and by producing
arguments. It is the agents' responsibility (and not the designers' role) to choose, in an
autonomous way, the actions to be performed by using their argumentation systems. 

In this section, we show how a conversation can be modeled using the CAN for- 
malism on the basis of this framework. In a conversational activity, agents manage 
commitments and arguments whose chaining must be coherent. Our purpose is to 
present the dynamics of conversations using our formalism. This representation al- 
lows us to ensure conversational consistency in terms of the actions performed by the 
agents on the commitments and arguments. Indeed, this formalism has two objectives: 
it can be used to analyze conversations, as well as a means to allow agents to take part
in coherent conversations. 

5.1 Formal Definition of a CAN 

A commitment and argument network is a mathematical structure which we define 
formally as follows: 

Definition 18: A commitment and argument network is a 15-tuple:
<A, E, SC_0, I, Ω, Σ, Θ, Δ, Π, α, β, δ, θ, γ, η> where:

• A: a finite and nonempty set of participants. A = {Ag_1, ..., Ag_n}.

• E: a finite and nonempty set of social commitments. These commitments can be
absolute commitments (C), conditional commitments (CC) or commitment attempts
(CT). E = {SC_0, ..., SC_n}.

• SC_0: a distinguished element of E: the initial commitment. This element allows us
to define the subject of a conversation.

• I: a finite and nonempty set of speech act indices (or identifiers) which can be
related to the creation and the positioning actions, to the argumentation
relations and to the connection relations. I = {i_0, ..., i_n}.

• Ω: a finite and nonempty set of both creation actions of elements of E and
positioning actions on elements of E, of Ω × I and of Σ × I. Ω = {Create, Accept, Accept
conditionally, Refuse, Question, Change, Withdraw, Satisfy}.

• Σ: a finite and possibly empty set of argumentation relations. Σ = {Defend, Attack}.

• Θ: a finite and possibly empty set of connection relations that can exist between
elements of E or between elements of E and elements of Σ × I. Θ = {Satisfy, Not
satisfy, Contradict, Explain, etc.}.

• Δ: a partial function relating a commitment to another commitment using one
argumentation relation characterized by an identifier i of I.

Δ: E × E → Σ × I

• Π: a partial function relating a commitment to a pair made up of an
argumentation relation and an element of I using one argumentation relation (characterized
by an identifier i of I).

Π: E × Σ × I → Σ × I

• α: a partial function relating an agent (a participant) to a commitment using a set
of pairs made up of a creation or a positioning action and an element of I.

α: A × E → 2^(Ω × I)

• β: a partial function relating an agent to an argumentation relation (characterized
by an identifier i of I) using a set of pairs made up of a creation or positioning
action and of an element of I.

β: A × Σ × I → 2^((Ω \ {Change}) × I)

• δ: a partial function relating an agent to a creation or a positioning action
(characterized by an identifier i of I) using a set of pairs made up of a positioning action
and an element of I.

δ: A × Ω × I → 2^((Ω \ {Create, Withdraw, Change}) × I)
• θ: a partial function relating a commitment to a creation or a positioning action
(characterized by an identifier i of I) using one argumentation relation.

θ: E × Ω × I → Σ × I

• γ: a partial function relating two commitments using a connection relation
(characterized by an identifier i of I).

γ: E × E → Θ × I

• η: a partial function relating a commitment to an argumentation relation using a
connection relation (characterized by an identifier i of I).

η: E × Σ × I → Θ × I

Let us now comment upon these sets and functions. In a conversation, the sets A,
E, Ω, Σ, Θ and I must be instantiated. For example, in a given conversation we can
have: A = {Ag_1, Ag_2}, E = {PC_0, PC_1, PC_2}, Ω = {Create, Accept, Question},
Σ = {Defend}, etc.

The function Δ allows us to define the argumentation relation which can exist
between two commitment contents, i.e. a defense or an attack relation. For example:

Δ(SC_i, SC_j) = (Defend, i_k).

This means that the content of the commitment SC_i (called the source of the defense
relation) defends the content of the commitment SC_j (called the target of the defense
relation). The index i_k associated with the defense relation is the identifier of the speech
act whose performance gives rise to this relation. Associating such an index with
argumentation relations and with various actions allows us to distinguish one relation
from another and one action from another of the same type.

The function Π allows us to define an argumentation relation on another
argumentation relation. For example:

Π(SC_i, Defend, i_k) = (Attack, i_l).

This relation points out that the content of the commitment SC_i attacks a defense
relation characterized by the index i_k. This defense relation is defined using the function
Δ. The attack relation defined by the function Π is characterized by the index i_l.

The function α allows us to define a set of creation and positioning actions
(acceptance, refusal, etc.) performed by an agent on a commitment content. For example:

α(Ag_1, SC_i) = {(Accept, i_k)}

This reflects the acceptance of the content related to the commitment SC_i. This
acceptance relation is characterized by the index i_k. Ag_1 belongs to the debtors set
associated with this commitment.

The function β allows an agent to take position by accepting, accepting
conditionally or refusing an argumentation relation. For instance:

β(Ag_1, Defend, i_k) = {(Refuse, i_l)}

This means that the agent Ag_1 refuses the defense relation which is defined by the
function Δ and characterized by the index i_k. The refusal relation is characterized by
the index i_l.

The function δ allows an agent to position itself relative to a positioning action
characterized by an index i by accepting it, accepting it conditionally, refusing it or
questioning it. The positioning action on which an agent can take position can be
defined by the function α or the function β. For instance:

δ(Ag_1, Refuse, i_k) = {(Question, i_l)}

This example shows the case in which agent Ag_1 questions a refusal action
characterized by index i_k. The question action is characterized by the index i_l.

The function θ allows us to define an argumentation relation binding a
commitment SC_i to a creation or a positioning action. The action is defined by the function α.
For example:

θ(SC_i, Refuse, i_k) = (Defend, i_l)

This example highlights the case in which the content of the commitment SC_i defends
the refusal action characterized by the index i_k. The refusal action is defined by the
function α. The index i_l characterizes the defense action.

The function γ allows us to define the connection relation which can exist between
the contents of two commitments. For example:

γ(SC_i, SC_j) = (Contradict, i_k).

This translates the fact that the content of the commitment SC_i contradicts the content
of the commitment SC_j. If p is the content of SC_i, then the content of SC_j is ¬p. This
contradiction relation is characterized by the index i_k.

The function η allows us to define a connection relation between a commitment
and an argumentation relation. For instance:

η(SC_i, Defend, i_k) = (Contradict, i_l).

This relation points out that the content of the commitment SC_i contradicts the
defense relation characterized by the index i_k. The connection relation thus defined is
characterized by the index i_l.
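
One possible in-memory representation of the 15 components of a CAN is sketched below in Python. It is an assumption-laden illustration, not part of the formalism: commitments, agents and indices are identified by strings, each partial function becomes a dictionary, and the helper method position is a hypothetical convenience that only covers the function α.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]        # (action or relation name, speech act index)
Key2 = Tuple[str, str]        # e.g. (commitment, commitment) or (agent, commitment)
Key3 = Tuple[str, str, str]   # e.g. (agent, relation or action name, index)


@dataclass
class CAN:
    A: Set[str]                                                     # participants
    E: Set[str]                                                     # commitments
    SC0: str                                                        # initial commitment
    I: Set[str] = field(default_factory=set)                        # speech act indices
    Omega: Set[str] = field(default_factory=set)                    # actions in use
    Sigma: Set[str] = field(default_factory=set)                    # argumentation relations
    Theta: Set[str] = field(default_factory=set)                    # connection relations
    Delta: Dict[Key2, Pair] = field(default_factory=dict)           # E x E -> Sigma x I
    Pi: Dict[Key3, Pair] = field(default_factory=dict)              # E x Sigma x I -> Sigma x I
    alpha: Dict[Key2, Set[Pair]] = field(default_factory=dict)      # A x E -> sets of pairs
    beta: Dict[Key3, Set[Pair]] = field(default_factory=dict)       # A x Sigma x I -> sets of pairs
    delta: Dict[Key3, Set[Pair]] = field(default_factory=dict)      # A x Omega x I -> sets of pairs
    theta: Dict[Key3, Pair] = field(default_factory=dict)           # E x Omega x I -> Sigma x I
    gamma: Dict[Key2, Pair] = field(default_factory=dict)           # E x E -> Theta x I
    eta: Dict[Key3, Pair] = field(default_factory=dict)             # E x Sigma x I -> Theta x I

    def __post_init__(self) -> None:
        if not self.A or not self.E:
            raise ValueError("A and E must be nonempty")
        if self.SC0 not in self.E:
            raise ValueError("SC0 must be an element of E")

    def position(self, agent: str, commitment: str, action: str, index: str) -> None:
        """Record the pair (action, index) in alpha(agent, commitment)."""
        if agent not in self.A or commitment not in self.E:
            raise KeyError("unknown agent or commitment")
        self.Omega.add(action)
        self.I.add(index)
        self.alpha.setdefault((agent, commitment), set()).add((action, index))
```

Dictionaries are a natural fit for partial functions: a missing key simply means the function is undefined at that point.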



5.2 Example 

In this section, we show how to represent a dialogue using the CAN formalism. We 
use the conceptual graphs notation (CG) proposed by Sowa [34] in order to describe 
the propositional contents of commitments. Conceptual graphs are a system of logic 
and a knowledge representation language consisting of concepts and relations be- 
tween these concepts. They are labeled graphs in which concept nodes are connected 
by relation nodes. With their direct mapping to natural language, CG serve as an in- 
termediate language for translating computer-oriented formalisms to and from natural 
languages. A concept is represented by a type (e.g. PERSON) and a referent (e.g. John)
and denoted [TYPE: Referent] (e.g. [PERSON: John]). A conceptual relation links two
concepts and is represented between brackets. When representing natural language
sentences, case-relations are usually used. Examples are: AGNT (agent), PTNT
(patient), OBJ (object), CHRC (characteristic), PTIM (point in time). The advantage of
CG over predicate calculus is that they can be used to represent the literal meaning of
utterances without ambiguities and in a logically precise form.

Let us consider the following dialogue D1:

(Example 2: dialogue D1)

SA(i_0, Ag_1, {Ag_2}, t_u0, U_0): The disease M is not genetic.

SA(i_1, Ag_2, {Ag_1}, t_u1, U_1): Why?

SA(i_2, Ag_1, {Ag_2}, t_u2, U_2): Because it does not appear at birth.

SA(i_3, Ag_2, {Ag_1}, t_u3, U_3): A disease which does not appear at birth can be genetic
as well.

SA(i_4, Ag_1, {Ag_2}, t_u4, U_4): How?

SA(i_5, Ag_2, {Ag_1}, t_u5, U_5): It can be due to a genetic anomaly in the DNA appearing
at a certain age.

SA(i_6, Ag_1, {Ag_2}, t_u6, U_6): It is true, you are right.

By its speech act identified by i_0, agent Ag_1 creates, as explained in Section 3, a
propositional commitment, i.e.:

SA(i_0, Ag_1, {Ag_2}, t_u0, U_0) ⊢def
Create(Ag_1, t_u0, PC_0(id_0, Ag_1, {Ag_2}, t_pc0, (active), (submitted), p_0, t_p0))
where PC_0 is the initial commitment of the dialogue, t_pc0 = t_p0 and p_0 is the
propositional content which can be described by the following CG:

¬[[DISEASE: M]→(CHRC)→[GENETIC]]

In the CAN formalism this speech act results in the function:

α(Ag_1, PC_0) = {(Create, i_0)}

Thereafter, agent Ag_2 performs the speech act identified by i_1 and takes position on
the content of PC_0 by questioning it. Thus, "questioned" becomes the current state of
PC_0. Hence, we have:

SA(i_1, Ag_2, {Ag_1}, t_u1, U_1) ⊢def
Question(Ag_2, t_u1, PC_0(id_0, Ag_1, {Ag_2}, t_pc0, (active), (submitted, questioned), p_0, t_p0))

In the CAN formalism this speech act results in the function:

α(Ag_2, PC_0) = {(Question, i_1)}

Then, agent Ag_1 defends the propositional content p_0 of its commitment PC_0 by
performing the speech act identified by i_2. Hence, it creates another commitment PC_1
whose content is p_1. Thus, "justified" becomes the current state of PC_0. We have:

SA(i_2, Ag_1, {Ag_2}, t_u2, U_2) ⊢def
Defend(Ag_1, t_u2, PC_0(id_0, Ag_1, {Ag_2}, t_pc0, (active), (submitted, questioned, justified), p_0, t_p0))
∧ Create(Ag_1, t_u2, PC_1(id_1, Ag_1, {Ag_2}, t_pc1, (inform, null, null), (active), (submitted), p_1, t_p1))

where t_pc1 = t_p1 and p_1 is described by the following CG:

¬[[DISEASE: M]←(AGNT)←[APPEAR]→(PTIM)→[BIRTH]]

In argumentation terms, agent Ag_1 presents its argument (p_1, p_0) (see Section 4).
Thus, we have:

Arg(Ag_1, p_1, Defend(Ag_1, t_u2, PC_0(id_0, Ag_1, {Ag_2}, t_pc0, (active), (submitted,
questioned, justified), p_0, t_p0)))

This is represented in the CAN formalism by the functions:

α(Ag_1, PC_1) = {(Create, i_2)}, Δ(PC_1, PC_0) = (Defend, i_2)

By the speech act identified by i_3, agent Ag_2 refuses Ag_1's argument. Then, it
creates a new commitment PC_2 whose content is p_2. We have:

SA(i_3, Ag_2, {Ag_1}, t_u3, U_3) ⊢def
Refuse(Ag_2, t_u3, Arg(Ag_1, p_1, Defend(Ag_1, t_u2, PC_0(id_0, Ag_1, {Ag_2}, t_pc0, (active),
(submitted, questioned, justified), p_0, t_p0))))
∧ Create(Ag_2, t_u3, PC_2(id_2, Ag_2, {Ag_1}, t_pc2, (active), (submitted), p_2, t_p2))
where t_pc2 = t_p2 and the content p_2 is described by the following CG (see footnote 1):

¬[¬[[DISEASE: *x]←(AGNT)←[APPEAR]→(PTIM)→[BIRTH]]
∧ [[*x]→(CHRC)→[GENETIC]]].

This is represented in the CAN formalism by the functions:

β(Ag_2, Defend, i_2) = {(Refuse, i_3)}, α(Ag_2, PC_2) = {(Create, i_3)}

Agent Ag_1's speech act identified by i_4 questions the content of the commitment PC_2.
This allows us to transfer the content to the "questioned" state:

SA(i_4, Ag_1, {Ag_2}, t_u4, U_4) ⊢def
Question(Ag_1, t_u4, PC_2(id_2, Ag_2, {Ag_1}, t_pc2, (active), (submitted, questioned), p_2, t_p2))

In the CAN formalism, this results in the function:

α(Ag_1, PC_2) = {(Question, i_4)}

Then, agent Ag_2 defends the content of its commitment PC_2 by performing the speech
act identified by i_5. It then creates another commitment PC_3 whose content is p_3.
Thus, "justified" becomes the current state of PC_2. We have:

SA(i_5, Ag_2, {Ag_1}, t_u5, U_5) ⊢def
Defend(Ag_2, t_u5, PC_2(id_2, Ag_2, {Ag_1}, t_pc2, (active), (submitted, questioned,
justified), p_2, t_p2))
∧ Create(Ag_2, t_u5, PC_3(id_3, Ag_2, {Ag_1}, t_pc3, (active), (submitted), p_3, t_p3))
where t_pc3 = t_p3 and the content p_3 is described by the following CG:

[ANOMALY-DNA: *x]←(AGNT)←[CAUSE]→(PTNT)→[DISEASE: y]
[*x]←(AGNT)←[APPEAR]→(PTIM)→[AGE: @certain]

1 To get the graph describing p_2, we use the rule:
p ⇒ q ≡ ¬(p ∧ ¬q), with p = ¬("there is a disease that appears at birth") and
q = ¬("this disease is genetic").
Note that in the formula *x is a mark of coreference which appears in the referent part of a
concept.

In argumentation terms, agent Ag_2 presents its argument (p_3, p_2). Thus, we have:

Arg(Ag_2, p_3, Defend(Ag_2, t_u5, PC_2(id_2, Ag_2, {Ag_1}, t_pc2, (active),
(submitted, questioned, justified), p_2, t_p2)))

In the CAN formalism, this results in the following functions:

α(Ag_2, PC_3) = {(Create, i_5)}, Δ(PC_3, PC_2) = (Defend, i_5)

Agent Ag_1's speech act identified by i_6 reflects Ag_1's acceptance of both the content
of PC_3 and the argument (p_3, p_2) that it supports. Thus, "accepted" is the final state
of p_3. We have:

SA(i_6, Ag_1, {Ag_2}, t_u6, U_6) ⊢def
Accept(Ag_1, t_u6, Arg(Ag_2, p_3, Defend(Ag_2, t_u5, PC_2(id_2, Ag_2, {Ag_1}, t_pc2, (active),
(submitted, questioned, justified), p_2, t_p2))))
∧ Accept(Ag_1, t_u6, PC_3(id_3, Ag_2, {Ag_1}, t_pc3, (active), (submitted, accepted), p_3, t_p3))

In the CAN formalism, this is represented by the functions:

β(Ag_1, Defend, i_5) = {(Accept, i_6)}, α(Ag_1, PC_3) = {(Accept, i_6)}
To summarize, the dialogue D1 can be represented by the following CAN:
<A, E, PC_0, I, Ω, Σ, Θ, Δ, Π, α, β, δ, θ, γ, η> such that:

A = {Ag_1, Ag_2},

E = {PC_0, PC_1, PC_2, PC_3},

Ω = {Create, Question, Refuse, Accept},

Σ = {Defend},

Θ = ∅,

I = {i_0, ..., i_6},

α(Ag_1, PC_0) = {(Create, i_0)}, α(Ag_2, PC_0) = {(Question, i_1)}
α(Ag_1, PC_1) = {(Create, i_2)}, Δ(PC_1, PC_0) = (Defend, i_2)

β(Ag_2, Defend, i_2) = {(Refuse, i_3)}, α(Ag_2, PC_2) = {(Create, i_3)}
α(Ag_1, PC_2) = {(Question, i_4)}, α(Ag_2, PC_3) = {(Create, i_5)}

Δ(PC_3, PC_2) = (Defend, i_5), α(Ag_1, PC_3) = {(Accept, i_6)}

β(Ag_1, Defend, i_5) = {(Accept, i_6)}
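
The network of dialogue D1 can be written down as plain Python dictionaries, from which the sets Ω, Σ and I are recovered mechanically. This is only an illustrative encoding: string identifiers such as "PC0" or "i2" and the variable names are assumptions made for the example.

```python
# alpha: (agent, commitment) -> set of (action, speech act index)
alpha = {
    ("Ag1", "PC0"): {("Create", "i0")},
    ("Ag2", "PC0"): {("Question", "i1")},
    ("Ag1", "PC1"): {("Create", "i2")},
    ("Ag2", "PC2"): {("Create", "i3")},
    ("Ag1", "PC2"): {("Question", "i4")},
    ("Ag2", "PC3"): {("Create", "i5")},
    ("Ag1", "PC3"): {("Accept", "i6")},
}

# Delta: (source commitment, target commitment) -> (argumentation relation, index)
Delta = {
    ("PC1", "PC0"): ("Defend", "i2"),
    ("PC3", "PC2"): ("Defend", "i5"),
}

# beta: (agent, argumentation relation, index) -> set of (action, index)
beta = {
    ("Ag2", "Defend", "i2"): {("Refuse", "i3")},
    ("Ag1", "Defend", "i5"): {("Accept", "i6")},
}


def pairs():
    """Yield every (action, index) pair produced by alpha and beta."""
    for value in list(alpha.values()) + list(beta.values()):
        yield from value


Omega = {name for name, _ in pairs()}
Sigma = {relation for relation, _ in Delta.values()}
I = {index for _, index in pairs()} | {index for _, index in Delta.values()}

# Every relation that beta takes position on must have been introduced by Delta.
dangling = [key for key in beta if (key[1], key[2]) not in Delta.values()]
```

Here Omega, Sigma and I come out as the sets listed in the summary, and dangling is empty, which is one simple way of checking that each positioning action refers to a relation that actually exists in the network.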

Fig. 2 shows the graphical representation of the network. 



5.3 CAN: A Means of Inter-agent Communication 

So far, we have shown how the CAN formalism enables us to illustrate the connect- 
edness of speech acts performed by the agents in a conversation. In the example of the 
previous section, we started from a pre-established dialogue, we examined it and we 
modeled it using a CAN. This highlights a process that enables us to analyze a con- 
versation using the CAN formalism. But the formalism also offers a means that en- 
ables agents to take part in consistent conversations. 
Fig. 2. The network associated with the dialogue D1

Agents can jointly build the network that represents their conversation as it pro- 
gresses. This allows the agents: 

1. To make sure at any time that the conversation is consistent; 

2. To determine which speech act to perform on the basis of the current state of the 
conversation, and using an argumentation system and other cognitive elements. 

Consistency is ensured by the relationships existing between different commit- 
ments, different argumentation relations and different actions (creation, acceptance, 
fulfillment, etc.). A speech act is consistent with the rest of the conversation if it leads 
to the creation of a new commitment related to another commitment through a con- 
nection or an argumentation relation, or if it makes it possible to take position on a 
commitment, on an argumentation relation or on an action. Moreover, the agent must 
know everything about the current state of the conversation in order to determine its
next speech act. For example, when an agent creates a commitment and/or an argu- 
mentation relation, one of the other agents may decide to act on what has been created 
by accepting it, by refusing it or by questioning it, depending on its argumentation 
system. Similarly, when an agent finds that its commitment, argument or action is 
being questioned, it must create a commitment in order to defend it. The network is 
built as the conversation progresses. This process differs from the one used to analyze 
a conversation. Therefore, agents use a dynamic process in order to build the network 
while taking part in the conversation. 

To illustrate this way of using the CAN formalism, we take the example of Sec- 
tion 5.2 and demonstrate how agents build the final network piece by piece. By doing 
that, agents are able to continue conversing. 

Agent $Ag_1$ decides to start a conversation (a dialogue) with another agent $Ag_2$ about
a particular topic $p_0$ that interests it (the underlying mechanism related to this choice
belongs to the cognitive layer and thus is not considered here (Figure 1)). Hence, $Ag_1$
creates a propositional commitment whose content is $p_0$, i.e.:

$\alpha(Ag_1, PC_0) = \{(Create, i_0)\}$

This corresponds to the speech act identified by $i_0$.

Then, agent $Ag_2$ decides to take position on the content of $PC_0$ by questioning it, since
it does not have any argument in favor of or against it. As a matter of fact, $Ag_2$ wants to
know which argument of $Ag_1$ supports the content of $PC_0$. Therefore, $Ag_2$ performs the
action corresponding to the speech act identified by $i_1$:

$\alpha(Ag_2, PC_0) = \{(Question, i_1)\}$

Now, $Ag_1$ must defend its proposition: it creates the commitment $PC_1$ whose content
defends the content of $PC_0$. In doing so, this agent performs the action corresponding
to the speech act identified by $i_2$:

$\alpha(Ag_1, PC_1) = \{(Create, i_2)\}$, $\Delta(PC_1, PC_0) = (Defend, i_2)$

$Ag_2$ has an argument against the defense relation. It refuses it by creating the
commitment $PC_2$. It performs the action corresponding to the speech act identified by $i_3$:

$\beta(Ag_2, Defend, i_2) = \{(Refuse, i_3)\}$, $\alpha(Ag_2, PC_2) = \{(Create, i_3)\}$

Agent $Ag_1$ questions the content of $PC_2$ using its argumentation system. By doing
that, it performs the action corresponding to the speech act identified by $i_4$:

$\alpha(Ag_1, PC_2) = \{(Question, i_4)\}$

Since the content of $Ag_2$’s commitment $PC_2$ is being questioned, the agent must try to
defend it. Thus, it creates the commitment $PC_3$ and performs the actions corresponding to
the speech act identified by $i_5$:

$\alpha(Ag_2, PC_3) = \{(Create, i_5)\}$, $\Delta(PC_3, PC_2) = (Defend, i_5)$

Thereafter, agent $Ag_1$ accepts the content of $PC_3$ and the argumentation relation
$(Defend, i_5)$, which are compatible with its argumentation system. It performs the actions
corresponding to the speech act identified by $i_6$:

$\beta(Ag_1, Defend, i_5) = \{(Accept, i_6)\}$, $\alpha(Ag_1, PC_3) = \{(Accept, i_6)\}$
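
The incremental construction just described can be illustrated with a small Python sketch. It is only an illustration under our own assumptions: the class `Network`, its method names, and the helper `undefended_questions` are hypothetical and are not part of the CAN formalism. The three dictionaries loosely mirror the functions written above for actions on commitments, actions on argumentation relations, and relations between commitments.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical in-memory view of a commitment and argument network."""
    # (agent, commitment) -> set of (action, speech act id)
    alpha: dict = field(default_factory=dict)
    # (agent, relation, speech act id of the relation) -> set of (action, speech act id)
    beta: dict = field(default_factory=dict)
    # (source commitment, target commitment) -> (relation, speech act id)
    relations: dict = field(default_factory=dict)

    def act(self, agent, commitment, action, sa):
        """Record an action (Create, Question, Accept, ...) on a commitment."""
        self.alpha.setdefault((agent, commitment), set()).add((action, sa))

    def relate(self, source, target, relation, sa):
        """Record an argumentation relation between two commitments."""
        self.relations[(source, target)] = (relation, sa)

    def act_on_relation(self, agent, relation, relation_sa, action, sa):
        """Record an action (Refuse, Accept, ...) on an argumentation relation."""
        self.beta.setdefault((agent, relation, relation_sa), set()).add((action, sa))

    def undefended_questions(self):
        """Commitments that were questioned but have no Defend relation yet."""
        questioned = {c for (_, c), acts in self.alpha.items()
                      if any(action == "Question" for action, _ in acts)}
        defended = {target for (_, target), (rel, _) in self.relations.items()
                    if rel == "Defend"}
        return sorted(questioned - defended)


# Replay dialogue D1 one speech act at a time.
can = Network()
can.act("Ag1", "PC0", "Create", "i0")
can.act("Ag2", "PC0", "Question", "i1")
pending_after_i1 = can.undefended_questions()  # Ag1 now owes a defense of PC0
can.act("Ag1", "PC1", "Create", "i2")
can.relate("PC1", "PC0", "Defend", "i2")
can.act_on_relation("Ag2", "Defend", "i2", "Refuse", "i3")
can.act("Ag2", "PC2", "Create", "i3")
can.act("Ag1", "PC2", "Question", "i4")
can.act("Ag2", "PC3", "Create", "i5")
can.relate("PC3", "PC2", "Defend", "i5")
can.act_on_relation("Ag1", "Defend", "i5", "Accept", "i6")
can.act("Ag1", "PC3", "Accept", "i6")
pending_at_end = can.undefended_questions()
```

In this sketch an agent could consult `undefended_questions()` before choosing its next speech act; how the choice itself is made belongs to the argumentation system and is not modeled here.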



6 Conclusion and Future Work 

In this paper, we proposed a formalism to represent the dynamics of conversations. 
The formalism offers an external representation of the conversational activity. In 
essence, the formalism has two purposes: on the one hand it helps to analyze conver- 
sations, and on the other hand it is a means of helping agents to take part in consistent 
conversations. This formalism uses an approach based on commitments and argu- 
ments to model conversations between autonomous agents. Using this approach, we 
can capture both the social and public aspects of conversations as well as the reason- 
ing aspect. We also proposed a communication model and an architecture for conver- 
sational agents that is compatible with this approach and this formalism. 

As an extension to our work, we intend to prove mathematically the existence of
one and only one CAN to represent a given coherent conversation (proof of uniqueness)
by using a formal way of representing dialogues proposed by [19]. We also
intend to use the formalism in order to represent different types of dialogues, for
example according to the classification of [38]. In addition, we intend to integrate
our formalism in dialogue games to provide more flexibility to agent communication.
Another key issue for future work is to define a formal semantics for our formalism.
We are investigating the idea of using CTL* and dynamic logic to develop a unified
semantics for commitments, arguments and existing relations between them. Finally, we
want to use model checking techniques to prove the validity of this semantics.

Abstract. Commitments are a powerful representation for modeling multiagent
interactions. Previous approaches have considered the semantics of commitments 
and how to check compliance with them. However, these approaches do not cap- 
ture some of the subtleties that arise in real-life applications, e.g., e-commerce, 
where contracts and institutions have implicit temporal references. The present 
paper develops a rich representation for the temporal content of commitments. 
This enables us to capture realistic contracts and institutions rigorously, and avoid 
subtle ambiguities. Consequently, this approach enables us to reason about 
whether and when exactly a commitment is satisfied or breached and whether 
it is or ever becomes unenforceable. 



1 Introduction and Objectives 

Protocols help streamline the complex interactions that can take place between au- 
tonomous, heterogeneous agents in a multiagent system. A special application setting 
is e-commerce, where the agents represent different parties that do business on-line. 

The role of commitments in modeling such rich interactions is widely recognized, 
because they enable the key content of an interaction to be represented and reasoned 
about, especially in the face of opportunities or exceptions. Commitments are thus more 
expressive than traditional formalisms such as finite state machines. Yolum and Singh 
[1] show how to build and execute commitment-based protocols and to reason about 
such protocols in the event calculus. Fornara and Colombetti [2] capture key aspects of 
the commitment lifecycle and further advance the idea of commitments as a data struc- 
ture. Some compliance aspects of commitment protocols in a branching-time temporal 
logic with potential causality have also been studied by Venkataraman and Singh [3]. 

Motivation. The above approaches show that commitments provide a viable represen- 
tational framework for designing, executing, and validating flexible protocols in multi- 
agent systems. However, current approaches take a limited view of the temporal aspects 
of commitments. This can prove to be a drawback for their use in real systems, since 
business deals usually involve many clauses and have subtle time periods of reference. 
The following is an informal list of some properties that are relevant in practice, but not 
naturally handled by current approaches. 

- Time Intervals. Contracts often involve time bounds, which simplify decisions
about the satisfaction or breach of commitments, which is one of the reasons tradi- 
tional representations (e.g., paper documents) rely on them. Practical commitments 
often must be satisfied either within a fixed, bounded interval or at a specified in- 
stant in the future. 

- Maintenance. Current work on commitments has concentrated on achievement con- 
ditions, whereas real-life commitments are as likely to be about the maintenance of 
certain conditions. For example, a typical service-level agreement (SLA) may in- 
volve committing to maintaining network connectivity during business hours. 

- Temporal anaphora. A particular variety of time bounds arises in the notion of temporal
anaphora, as introduced by Partee [4]. A promise such as “I will send you
the goods” or a claim such as “I tried to call you five times” involves an implicit 
range of salient times within which the specified action occurred or will occur. 
Although we are not concerned here with commonsense reasoning, our represen- 
tational framework for commitments should be able to accommodate the results of 
such reasoning. 

Point-based temporal logics, which are commonly used in distributed system specifications,
are inadequate to express the above requirements. Accordingly, we develop
an extension of the well-known branching-time logic Computation Tree Logic (CTL),
developed by Emerson [5], which can capture the cases of interest here.

Challenges. We use examples from situations that arise in practical applications of web 
services to motivate our study of temporal aspects of commitments. We consider the 
example of a travel agent, who wishes to book an airline ticket to a certain destination, 
a rental car to use while there, and also a hotel room to stay at. 

Example 1. The travel agent wants the passenger to be able to fly on one particular day, 
reserving the right to choose any flight on that day. If the airline is willing to offer such 
a deal, it becomes committed to maintaining a condition - a booked ticket - over an 
extended period of time. We need to be able to specify such maintenance conditions in 
commitments. 

Example 2. The car rental company might offer, for some reason, one week’s free rental
in the month of January. This is a maintenance condition within another time period. 
We need to be able to capture such temporal intricacies without bloating the domain 
language. 

Example 3. Some commitments may violate constraints about time that commonsense 
reasoning would have detected. Such a situation can arise, for example, when a hotel 
offers an electronic discount coupon that expires today, but the coupon can be used only 
in some future time period, say, a special spring break offer that expires long before
spring break. 

Example 4. Another interesting example is when a warranty cannot be verified within 
the period over which the warranty is valid. Consider a customer who rents a car from a 
company that guarantees that the car will not break down for at least two days, and
promises a replacement car if it does. However, if the car were rented on a Friday and 
the company is closed on the weekends, then the customer is at a disadvantage. 

Example 1 is solved in Section 4.1, Examples 2 and 3 in Section 4.2, and Example 4
in Section 4.3. 

Contribution. Our main contribution is in applying a richer temporal representation 
to commitments and showing how the satisfaction or breach of a commitment can be 
detected. Further, the temporal aspects of commitments are independent of the domain- 
specific semantics of the condition that the commitment is about, so that we can reason 
about the temporal aspects of commitments in a domain-independent manner. 

Organization. Section 2 introduces background concepts, and Section 3 develops our
technical approach. Section 4 explains our results on the resolution of commitments that
use temporally qualified propositions, and Section 5 summarizes our proposal and
identifies directions of further research.

2 Background: Time and Commitments 

We next briefly explain our model of time and our temporal logic, and introduce the 
notion of commitments. 

2.1 The Temporal Framework 

We use a discrete, branching-time model, as shown in Figure 1. The temporal model 
has the following features: 

F1 The world is a set of discrete moments in time. $\mathbf{M}$ is the set of all possible
moments, partially ordered by $\prec$. The past is linear, and the future branches.

F2 Each moment $m$ is given a timestamp $\tau(m) \in T$, totally ordered by $<$. If
$m_0 \prec m_1$ then $\tau(m_0) < \tau(m_1)$.

F3 A scenario $S$ at a moment is a maximal set of moments containing the given moment
and all moments along some branch in the future of the given moment. $\mathbf{S}_m$
denotes the set of all scenarios at a moment $m$. A scenario $S \in \mathbf{S}_m$ has the
following properties:

- $S$ is rooted; i.e., $m \in S$.

- $S$ is linear; i.e., $(\forall m_1, m_2 \in S : (m_1 = m_2$ or $m_1 \prec m_2$ or $m_2 \prec m_1)$ and
$m \prec m_1)$.

For example, in Figure 1, the path $m_0 m_1 m_5 \ldots \in \mathbf{S}_{m_0}$.

We use an extension of Emerson’s Computation Tree Logic (CTL) [5]. We now
introduce the components of CTL.

1. Booleans. $\neg$ and $\lor$ carry their usual meaning; true, false, $\rightarrow$, and $\land$ are the
obvious abbreviations.

2. Linear time. These operators apply over a particular scenario.

U: A proposition $p\,\mathsf{U}\,q$, read p until q, is true at a moment $m$, on a scenario, iff $q$
holds at some moment $m_x$ in the future on the given scenario and $p$ holds at all
moments between $m$ and $m_x$.

F: A proposition $\mathsf{F}p$, read eventually p, means that $p$ holds at some point in the
future in the given scenario. $\mathsf{F}p$ abbreviates $\mathit{true}\,\mathsf{U}\,p$.

G: A proposition $\mathsf{G}p$, read always p, means that $p$ always holds in the future on the
given scenario. $\mathsf{G}p$ abbreviates $\neg\mathsf{F}\neg p$.

3. Branching quantifiers. The operator $\mathsf{A}$ denotes “in all scenarios at the present
moment”.

Fig. 1. A schematic representation of our model of time
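
As a rough illustration of the linear-time operators, the following Python sketch evaluates U, F, and G over a finite prefix of one scenario. Several things here are our assumptions rather than part of the logic: the scenario is a finite list of sets of atomic propositions, "the future" includes the current moment, and the function names are hypothetical.

```python
def until(p, q, trace, i=0):
    """p U q at position i: q holds at some j >= i and p holds at every k with i <= k < j."""
    for j in range(i, len(trace)):
        if q(trace[j]):
            return True
        if not p(trace[j]):
            return False
    return False  # q never held on this finite prefix


def eventually(p, trace, i=0):
    """F p, defined as (true U p)."""
    return until(lambda state: True, p, trace, i)


def always(p, trace, i=0):
    """G p, defined as not F not p (checked on the finite prefix only)."""
    return not eventually(lambda state: not p(state), trace, i)


def offer(state):
    return "offer" in state


def paid(state):
    return "paid" in state


# Each moment is labeled with the atomic propositions true at it.
scenario = [{"offer"}, {"offer"}, {"offer", "paid"}]
offer_until_paid = until(offer, paid, scenario)
always_offer = always(offer, scenario)
always_paid = always(paid, scenario)
```

The branching quantifier A would then amount to requiring the chosen check on every scenario at the present moment.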



2.2 Temporal Qualification 

The temporal commitment structure specified by Fornara and Colombetti [2] forms the 
basis for our temporal commitment scheme. Every condition specified in the commit- 
ment language has a time interval, a temporal quantifier, and a proposition in the domain 
language. 

We use timestamps to denote endpoints of time intervals. Two timestamps $d_l$ and $d_u$
are used to represent an interval that begins at $d_l$ and ends at $d_u$, both instants inclusive.
For any such time interval, $d_l < d_u$. We introduce the following temporal quantifiers to
quantify over instants in the interval:

1. $[\,]$ is an existential quantifier over a time interval. That is, $[d_1, d_2]p$ means the
proposition $p$ has to hold at one or more instants in the interval beginning at $d_1$ and
ending at $d_2$.

2. $\overline{[\,]}$ is a universal quantifier over a time interval. That is, $\overline{[d_1, d_2]}p$ means the
proposition $p$ has to hold at every instant in the interval beginning at $d_1$ and ending at $d_2$.
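
The two quantifiers can be pictured with a minimal Python sketch. It assumes, purely for illustration, that timestamps are integers and that a proposition is a function from a timestamp to a truth value; the function names are ours.

```python
def holds_sometime(p, d_l, d_u):
    """Existential interval: p holds at one or more instants in [d_l, d_u], endpoints included."""
    return any(p(t) for t in range(d_l, d_u + 1))


def holds_throughout(p, d_l, d_u):
    """Universal interval: p holds at every instant in [d_l, d_u], endpoints included."""
    return all(p(t) for t in range(d_l, d_u + 1))


def connected(hour):
    """Hypothetical proposition: the network is up during hours 9 to 12."""
    return 9 <= hour <= 12


up_all_morning = holds_throughout(connected, 9, 12)
up_all_day = holds_throughout(connected, 9, 17)
up_at_some_point = holds_sometime(connected, 12, 17)
```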

2.3 Commitments 

Commitments are obligations that one agent has towards another, as Castelfranchi
describes [6]. Formally, a commitment $C(id, x, y, p)$ relates a debtor x, a creditor y, and a
condition p in such a way that x becomes responsible to y for satisfying the condition p;
the commitment has a unique identifier id. The commitment is said to be satisfied when 
the condition p holds. There can be at most one commitment with a particular identifier 
in our entire model. 

Commitment Operations. Commitments are created, satisfied, and transformed in cer- 
tain ways. According to Singh [7], the following operations can be performed on com- 
mitments. 

1. CREATE(x, c) establishes the commitment c in the system. This can only be
performed by c’s debtor x.

2. CANCEL(x, c) cancels the commitment c. This can only be performed by c’s debtor
x. Generally, cancellation is compensated by making another commitment.

3. RELEASE(y, c) releases c’s debtor x from commitment c. This can only be
performed by the creditor y.

4. ASSIGN(y, z, c) replaces y with z as c’s creditor.

5. DELEGATE(x, z, c) replaces x with z as c’s debtor.

6. DISCHARGE(x, c): c’s debtor x fulfills the commitment.

We note that a commitment has to be created using the CREATE operation for it 
to exist. Further, we assume equivalence of the performance of a DISCHARGE opera- 
tion and the satisfaction of the condition p. That is, we assume that the DISCHARGE 
operation brings about p, and conversely, if p occurs, the DISCHARGE operation is as- 
sumed to have happened. This assumption does not impoverish the theory, and we defer 
a discussion on it to Section 5. 

We also introduce two predicates in Section 3.1 that help us capture the notion of 
fulfillment of a commitment rigorously. 

Commitment Predicates. For every operation on commitments listed above, we intro- 
duce a corresponding predicate which has the same name as the operation, but with 
lowercase letters. The predicates, instantiated with proper parameters, are true at the 
moment at which the corresponding operation is performed. For example, if an agent
x performs a CREATE(x, c) operation at a moment $m_i$, then the predicate create(x, c) is
said to have the truth value true at the moment $m_i$.

Commitment Identifiers. Every commitment is assumed to have a unique identifier that 
helps to distinguish it from other commitments that may have the same debtor, the same 
creditor, and the same condition. For example, if I promise to pay you $5 twice, then a 
single payment of $5 should not suffice. The predicates in question also apply to specific 
commitments, i.e., respecting their unique identifiers. For example, I may cancel one of 
my two commitments to pay $5 without automatically canceling the other commitment. 
The identifiers come from a domain D, which can be thought of as formed of the natural 
numbers. 
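
To make the role of operations and identifiers concrete, here is a small Python sketch. The class `CommitmentStore`, its log format, and the choice to raise `PermissionError` are our own assumptions; in particular, letting the creditor perform an assignment and the debtor a delegation, and creating a fresh commitment on assignment or delegation, are modeling choices of this sketch.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Commitment:
    id: int          # unique identifier, drawn from the natural numbers
    debtor: str
    creditor: str
    condition: str


class CommitmentStore:
    """Hypothetical registry that applies the six operations and logs the predicates."""

    def __init__(self):
        self._ids = count(1)
        self.active = {}   # identifier -> Commitment
        self.log = []      # (predicate name, actor, identifier), in order of occurrence

    def create(self, debtor, creditor, condition):
        c = Commitment(next(self._ids), debtor, creditor, condition)
        self.active[c.id] = c
        self.log.append(("create", debtor, c.id))
        return c

    def _retire(self, op, actor, c, allowed):
        if c.id not in self.active:
            raise KeyError(f"commitment {c.id} is not active")
        if actor != allowed:
            raise PermissionError(f"{op} may only be performed by {allowed}")
        del self.active[c.id]
        self.log.append((op, actor, c.id))

    def cancel(self, actor, c):
        self._retire("cancel", actor, c, c.debtor)

    def release(self, actor, c):
        self._retire("release", actor, c, c.creditor)

    def discharge(self, actor, c):
        self._retire("discharge", actor, c, c.debtor)

    def assign(self, actor, new_creditor, c):
        self._retire("assign", actor, c, c.creditor)
        return self.create(c.debtor, new_creditor, c.condition)

    def delegate(self, actor, new_debtor, c):
        self._retire("delegate", actor, c, c.debtor)
        return self.create(new_debtor, c.creditor, c.condition)


store = CommitmentStore()
first = store.create("me", "you", "pay $5")
second = store.create("me", "you", "pay $5")   # same parties and condition, new identifier
store.cancel("me", first)                       # the second promise is unaffected
reassigned = store.assign("you", "bank", second)
```

Because each commitment carries its own identifier, cancelling `first` leaves `second` untouched, which is the behavior the $5 example calls for.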

3 Technical Framework 

This section introduces the concept of time intervals, describes the formal language for 
our scheme, and introduces two key predicates dealing with the resolution of commit- 
ments. 

3.1 The Formal Language 

The following is a grammar for our formal language, T, expressed in Backus-Naur
Form (BNF). Here, tokens beginning with an uppercase letter denote nonterminals,
tokens beginning with a lowercase letter denote lexical items that are not analyzed by this
grammar, agent stands for any agent symbol, $\rightarrow$ and $\mid$ are meta-symbols of the BNF,
and all other symbols are terminals. T is the unique starting symbol for the language
of T.

G1 $T \rightarrow \mathsf{A}\,Expr \mid \mathsf{E}\,Expr \mid breached(C) \mid satisfied(C)$

G2 $Expr \rightarrow Expr\,\mathsf{U}\,Expr \mid Prop$

G3 $Prop \rightarrow \neg Prop \mid Prop \lor Prop \mid Tprop \mid Atomic$

G4 $Tprop \rightarrow [I, I]\,Prop \mid \overline{[I, I]}\,Prop$

G5 $I \rightarrow date \mid variable \mid date + duration \mid variable + duration$

G6 $Atomic \rightarrow Oper \mid a$

G7 $Oper \rightarrow create(agent, C) \mid cancel(agent, C) \mid delegate(agent, agent, C) \mid$
$assign(agent, agent, C) \mid release(agent, C) \mid discharge(agent, C)$

G8 $C \rightarrow C(identifier, agent, agent, Prop)$

In the grammar, $a$ is an atomic proposition in the domain, identifier is a unique
commitment identifier, date is a timestamp, $date \in T$, variable is a time variable that is
bound to a timestamp, and duration is a length of time used to construct simple
additive expressions with time variables. The terms generated by $I$ in the grammatical
rule G5 are called temporal terms. Temporal terms that have no variables are called
ground temporal terms.

As a convention, we use $p$ and $q$ to denote simple propositions and $p_t$ and $q_t$ to
denote temporally qualified propositions.

New Predicates for Resolving Commitments. We propose two predicates, breached(c) 
and satisfied(c), indicating violation and fulfillment of the given commitment, respec- 
tively. 

3.2 Semantics 

We now describe the semantics for the language T. For a proposition $p$, $M \models_m p$
means that a model $M$ satisfies proposition $p$ at moment $m$. $M \models_{S,m} p$ means that
the model $M$ satisfies $p$ at moment $m$ in the scenario $S$. When resolving nested interval
expressions, $M \models_{S,m,m_l,m_u,\mathsf{E}} p$ means that $M$ satisfies $p$ at a moment $m$ in the scenario
$S$ within the interval that begins at $m_l$ and ends at $m_u$, both $m_l$ and $m_u$ being in the
scenario $S$. Two constants $\mathsf{E}$ and $\mathsf{U}$ are used as subscripts to denote whether the interval
is to be interpreted as existentially or universally quantified, respectively.

An interpretation $I$ labels each moment with the atomic propositions and the ground
commitment predicates that are true at that moment. Ground commitment predicates
here refer to $create(\cdot,\cdot)$, $assign(\cdot,\cdot,\cdot)$, $delegate(\cdot,\cdot,\cdot)$, $cancel(\cdot,\cdot)$, $release(\cdot,\cdot)$, and
$discharge(\cdot,\cdot)$. In a practical system, these elements would be specified in some manner
external to the logic. For instance, a create operation might be taken to hold whenever a
user submits a form over the Web. Let $\Phi$ be a set containing the atomic propositions $a$
and ground commitment predicates. Then $I : \mathbf{M} \mapsto \wp(\Phi)$.

Let $M = \langle A, \mathbf{M}, \prec, I, T \rangle$ be a model for the formal language T, where $\mathbf{M}$, $T$, and
$\prec$ have the meaning as explained in Section 2, $A$ is a set of agent symbols, and $I$ is an
interpretation as defined above.

The semantics uses a substitution for time variables that occur in the bounds of
intervals. If $p$ is an expression, and $\vec{x}$ is a vector of all time variables in the
expression, then $p|^{\vec{d}}_{\vec{x}}$ is the expression produced by a uniform, concurrent, and element-wise
substitution of $\vec{x}$ by $\vec{d}$. Such an expression will have only ground temporal terms. A
ground temporal term is evaluated through date arithmetic. In the following, if $t$ is a
ground temporal term, then $[t]$ is the timestamp corresponding to it. For example, if a
temporal term $t$ is bound to a timestamp corresponding to January 1, then the temporal
term $t + 7\,\text{days}$ will be bound to the ground temporal term January 8. The details of
the arithmetic are not formalized here.
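
Although the arithmetic is left unformalized, the January example can be sketched in Python with the standard `datetime` module. The representation of a temporal term as a `(base, offset)` pair, the function name `ground`, and the year 2004 are assumptions made only for this illustration.

```python
from datetime import date, timedelta


def ground(term, binding):
    """Evaluate a temporal term (base, offset) to a timestamp.

    base is either a date or the name of a time variable looked up in binding;
    offset is a duration (timedelta).
    """
    base, offset = term
    start = binding[base] if isinstance(base, str) else base
    return start + offset


# t is bound to January 1 (the year is arbitrary); t + 7 days evaluates to January 8.
week_later = ground(("t", timedelta(days=7)), {"t": date(2004, 1, 1)})
same_day = ground((date(2004, 1, 1), timedelta(0)), {})
```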

The semantics of T is given next. Here, $c$ refers to a commitment of the form
$C(id, j, k, p)$, the $d_i$ denote timestamps, and $active(c)$ is an abbreviation for
$\neg(cancel(j, c) \lor delegate(j, \cdot, c) \lor assign(k, \cdot, c) \lor release(k, c) \lor discharge(j, c))$.

The semantic rules R1, R2, R3, and R4 give the meanings of the expressions generated
by the grammar rule G1, and the rules R5 and R6 give the meanings for the grammar
rule G2. Rule R6 uses $m_0$ to denote the earliest moment on a scenario $S$ and $\infty$ as a
placeholder to denote the limit of the scenario.

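R1 $M \models_m \mathsf{A}p$ iff $(\forall S : S \in \mathbf{S}_m \Rightarrow M \models_{S,m} p)$.

R2 $M \models_m \mathsf{E}p$ iff $(\exists S : S \in \mathbf{S}_m \Rightarrow M \models_{S,m} p)$.

R3 $M \models_m \mathit{satisfied}(c)$ iff $(\exists m_3 : m_3 \prec m$ and $m_3 \in S$ and $M \models_{S,m_3} \mathit{discharge}(j, c)$
and $(\exists m_1 \in S : m_1 \prec m_3$ and $M \models_{S,m_1} \mathit{create}(j, c)$ and $(\forall m_2 \in S : m_1 \prec
m_2 \prec m_3 \Rightarrow M \models_{S,m_2} \mathit{active}(c))))$.

R4 $M \models_m \mathit{breached}(c)$ iff $(\exists m_3 \in S : m_3 \prec m$ and $m \in S$ and
$M \models_{S,m_3} \mathsf{AG}\neg\mathit{discharge}(j, c)$ and $\exists m_1 \in S : m_1 \prec m_3$ and $M \models_{S,m_1} \mathit{create}(j, c)$ and
$(\forall m_2 \in S : m_1 \prec m_2 \prec m_3 \Rightarrow M \models_{S,m_2} \mathit{active}(c)))$.

R5 $M \models_{S,m} p\,\mathsf{U}\,q$ iff $(\exists m_1 \in S : m \prec m_1$ and $M \models_{S,m_1} q$ and $(\forall m_2 : m \prec m_2 \prec
m_1 \Rightarrow M \models_{S,m_2} p))$.

R6 $M \models_{S,m} p$ iff $\exists \vec{d} : M \models_{S,m,m_0,\infty,\mathsf{E}}\ p|^{\vec{d}}_{\vec{x}}$, where $\vec{x}$ is a vector of the variables
that occur in $p$, $\vec{d}$ is a vector of timestamps, and $|\vec{d}| = |\vec{x}|$.

The following are the semantics of the grammatical rules G3 and G4. Rules R7
through R11 apply to an existentially quantified interval while rules R12 through R16
apply to a universally quantified interval.

R7 $M \models_{S,m,m_l,m_u,\mathsf{E}} p$ iff $\exists m_x \in S : m_l \prec m_x \prec m_u$ and $m_l, m_u \in S$ and
$p \in I(m_x)$.

R8 $M \models_{S,m,m_l,m_u,\mathsf{E}} \neg p$ iff $\exists m_x \in S : m_l \prec m_x \prec m_u$ and $m_l, m_u \in S$ and
$M \not\models_{m_x} p$.

R9 $M \models_{S,m,m_l,m_u,\mathsf{E}} p \lor q$ iff $\exists m_x, m_y \in S : m_l \prec m_x, m_y \prec m_u$ and $m_l, m_u \in S$
and $M \models_{m_x} p$ or $M \models_{m_y} q$.

R10 $M \models_{S,m,m_l,m_u,\mathsf{E}} [d'_l, d'_u]p$ iff $\tau(m'_l) = d'_l$ and $\tau(m'_u) = d'_u$ and $m_l, m_u, m'_l, m'_u \in S$
and $\exists m_x, \exists m'_x : m_l \prec m_x \prec m_u$ and $m'_l \prec m'_x \prec m'_u$ and $m'_x \prec m_x \Rightarrow
M \models_{S,m_x} p$.

R11 $M \models_{S,m,m_l,m_u,\mathsf{E}} \overline{[d'_l, d'_u]}p$ iff $\tau(m'_l) = d'_l$ and $\tau(m'_u) = d'_u$ and $m_l, m_u, m'_l, m'_u \in S$
and $\exists m_x, \forall m'_x : m_l \prec m_x \prec m_u$ and $m'_l \prec m'_x \prec m'_u$ and $m'_x \prec m_x \Rightarrow
M \models_{S,m_x} p$.

R12 $M \models_{S,m,m_l,m_u,\mathsf{U}} p$ iff $\forall m_x \in S : m_l \prec m_x \prec m_u$ and $m_l, m_u \in S$ and
$p \in I(m_x)$.

R13 $M \models_{S,m,m_l,m_u,\mathsf{U}} \neg p$ iff $\forall m_x \in S : m_l \prec m_x \prec m_u$ and $m_l, m_u \in S$ and
$M \not\models_{m_x} p$.

R14 $M \models_{S,m,m_l,m_u,\mathsf{U}} p \lor q$ iff $\forall m_x \in S : m_l \prec m_x \prec m_u$ and $m_l, m_u \in S$ and
$M \models_{m_x} p$ or $M \models_{m_x} q$.

R15 $M \models_{S,m,m_l,m_u,\mathsf{U}} [d'_l, d'_u]p$ iff $\tau(m'_l) = d'_l$ and $\tau(m'_u) = d'_u$ and $m_l, m_u, m'_l, m'_u \in S$
and $\forall m_x, \exists m'_x : m_l \prec m_x \prec m_u$ and $m'_l \prec m'_x \prec m'_u$ and $m'_x \prec m_x \Rightarrow
M \models_{S,m_x} p$.

R16 $M \models_{S,m,m_l,m_u,\mathsf{U}} \overline{[d'_l, d'_u]}p$ iff $\tau(m'_l) = d'_l$ and $\tau(m'_u) = d'_u$ and $m_l, m_u, m'_l, m'_u \in S$
and $\forall m_x, \forall m'_x : m_l \prec m_x \prec m_u$ and $m'_l \prec m'_x \prec m'_u$ and $m'_x \prec m_x \Rightarrow
M \models_{S,m_x} p$.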

Further, we impose the following constraints on the model to capture operations on 
commitments. 

C1 A commitment cannot be created more than once with a given identifier.

$M \models_{S,m} \mathit{create}(j, C(id, j, k, p)) \Rightarrow \forall m_1 : m \prec m_1 \Rightarrow (\forall j_1, k_1, p : M \not\models_{S,m_1}
\mathit{create}(j_1, C(id, j_1, k_1, p)))$.

C2 When a commitment is assigned, it is no longer active, but a new commitment with
the new creditor is created.

$M \models_{S,m} (\mathit{assign}(k, z, c) \Rightarrow \mathsf{AG}\neg\mathit{active}(c) \land \mathit{create}(j, c'))$
where $c = C(id, j, k, p)$ and $c' = C(id', j, z, p)$.

C3 When a commitment is delegated, it is no longer active, but a new commitment
with a different debtor is created.

$M \models_{S,m} (\mathit{delegate}(j, z, c) \Rightarrow \mathsf{AG}\neg\mathit{active}(c) \land \mathit{create}(z, c'))$
where $c = C(id, j, k, p)$ and $c' = C(id', z, k, p)$.

C4 When a commitment is first created, it is active and not impossible to discharge.

$M \models_{S,m} (\mathit{create}(c) \Rightarrow \mathit{active}(c) \land \neg\mathsf{AG}\neg\mathit{discharge}(c))$

C5 After a commitment has been breached, no operation can be performed on it.

$M \models_m \mathit{breached}(c) \Rightarrow M \models_{S,m} \mathsf{AG}\,\mathit{active}(c)$

3.3 Commitment Life Cycle 

Using the above semantics, we establish some simple but important lemmas indicating
the stability of the $\mathit{breached}(\cdot)$ and $\mathit{satisfied}(\cdot)$ predicates.

When a commitment is first created, it is neither breached nor satisfied. Eventually,
it might be breached and thus remain breached forever; or it may be satisfied and
remain satisfied forever. This has the effect of applying a three-valued logic for the
satisfaction of commitments.

Lemma 1. $M \models_{S,m} \mathit{create}(c) \Rightarrow M \models_m (\neg\mathit{breached}(c) \land \neg\mathit{satisfied}(c))$.

Proof. By applying constraint C4 to the semantic definitions R4 and R3.

Lemma 2. $M \models_m \mathit{breached}(c)$ iff $\forall m_x : m \prec m_x \Rightarrow M \models_{m_x} \mathit{breached}(c)$.

Proof. By semantic definition R4.

Lemma 3. $M \models_m \mathit{satisfied}(c)$ iff $\forall m_x : m \prec m_x \Rightarrow M \models_{m_x} \mathit{satisfied}(c)$.

Proof. By semantic definition R3.

Lemma 4. $M \models_m (\neg\mathit{satisfied}(c) \Leftarrow \mathit{breached}(c))$.

Proof. By Lemma 1.

Lemma 5. $M \models_m (\neg\mathit{breached}(c) \Leftarrow \mathit{satisfied}(c))$.

Proof. By Lemma 1.

Lemmas 2 and 3 are shown in Figures 2 and 3, respectively. The moments
$m_1$, $m_2$, and $m_3$ marked in the figures denote moments used in R3 and R4. Note that
a commitment is active even after it is breached. However, a commitment cannot be
active after it is satisfied. These observations follow from the definition of the $\mathit{active}(\cdot)$
predicate and the constraint C5.
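
The three-valued life cycle and the stability expressed by Lemmas 2 and 3 can be pictured with a short Python sketch. It simplifies heavily and rests on assumptions of ours: time is an integer index along one scenario, the commitment is created at instant 0, and a single `deadline` stands in for the point after which discharge has become impossible (the role played by $\mathsf{AG}\neg\mathit{discharge}$ in R4). The names `Status` and `timeline` are hypothetical.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # neither satisfied nor breached
    SATISFIED = "satisfied"
    BREACHED = "breached"


def timeline(horizon, deadline, discharged_at=None):
    """Status of one commitment at each instant 0..horizon along a single scenario."""
    statuses = []
    for now in range(horizon + 1):
        if discharged_at is not None and discharged_at <= min(now, deadline):
            statuses.append(Status.SATISFIED)   # discharged in time; stays satisfied
        elif now > deadline:
            statuses.append(Status.BREACHED)    # deadline passed undischarged; stays breached
        else:
            statuses.append(Status.PENDING)
    return statuses


kept = timeline(horizon=4, deadline=2, discharged_at=1)
broken = timeline(horizon=4, deadline=2)
```

In both runs the status leaves `PENDING` at most once and never changes afterwards, and the two final verdicts are mutually exclusive.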

4 Resolving Temporal Commitments 

A temporal commitment is resolvable if its satisfaction or breach can be determined 
at some moment. Under certain conditions, the unresolvability of a temporal commit- 
ment can be ascertained even before the specified time interval occurs. We now discuss 
cases where a temporally quantified proposition is not resolvable, and develop meth- 
ods to detect such cases. Based on the resolvability of such propositions, we can detect 
satisfaction or breach of temporal commitments. 

4.1 Nested Interval Expressions 

The language given in Section 3.1 allows for propositions to be nested within multiple
levels of time intervals. Although there are many nested intervals whose interpretation
in common language does not make sense or induces redundancy, some nested time
intervals do make sense in real-life situations. We give examples of both meaningful
and meaningless nested interval propositions.

Fig. 3. Temporal behavior of the satisfied(·) predicate

Allen [8] defines 13 possible temporal relationships between any two given time
intervals. Figure 4 shows these 13 relationships for the two intervals contained in the
proposition $[d_1, d_2](\overline{[d_3, d_4]}p)$; i.e., the intervals $[d_1, d_2]$ and $[d_3, d_4]$. Here, time
increases from left to right. The shaded portions are intervals of one quantifier type, and
the unshaded portions are intervals of the other. Thirteen such cases can be constructed for
each combination of temporal quantifiers applied to each of the intervals, but we show only
one interval-quantifier combination pair as an example; i.e., the quantified intervals
$[d_1, d_2]$ and $\overline{[d_3, d_4]}$.

If we consider nested temporally quantified propositions as being conditions of
commitments, we can see which kinds of nesting will make the success of the
commitment unresolvable. Cases 1.1, 3.1, 4.1, 5.1, and 6.1 are not resolvable, but cases 1.2,
2.1, 3.2, 4.2, 5.2, 6.2, 7.1, and 7.2 can be resolved for reasons listed below. The term
inner proposition is used to refer to the temporal proposition $\overline{[d_3, d_4]}p$.

In cases 1.1, 3.1, 4.1, 5.1, and 6.1, the inner proposition’s time interval does not 
complete until after the outer time interval completes. The inner interval has references 
to instants in the future. Since the future cannot be seen in advance, these cases cannot 
be resolved. 

In cases 1.2, 2.1, 3.2, 4.2, 5.2, 6.2, 7.1, and 7.2, the inner proposition’s success can
be resolved at one or more instants within the interval of the outer proposition. Since
the outer proposition has an existentially quantified time interval $[d_1, d_2]$ and we have at
least one instant of resolvability, these cases can be resolved.
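
The case analysis can be sketched in Python as follows. The sketch assumes integer timestamps and proper intervals (start strictly before end), uses Allen's usual relation symbols as they label the panels of Figure 4, and treats the outer interval as existentially quantified and the inner one as universally quantified. The function names and the set `UNRESOLVABLE` are ours.

```python
def allen_relation(a, b):
    """Allen relation of interval a = (a1, a2) to interval b = (b1, b2)."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "<"      # a before b
    if b2 < a1:
        return ">"      # a after b
    if a2 == b1:
        return "m"      # a meets b
    if b2 == a1:
        return "mi"     # a met by b
    if a1 == b1 and a2 == b2:
        return "="
    if a1 == b1:
        return "s" if a2 < b2 else "si"   # starts / started by
    if a2 == b2:
        return "f" if a1 > b1 else "fi"   # finishes / finished by
    if b1 < a1 and a2 < b2:
        return "d"      # a during b
    if a1 < b1 and b2 < a2:
        return "di"     # a contains b
    return "o" if a1 < b1 else "oi"       # overlaps / overlapped by


# Relations of cases 1.1, 3.1, 4.1, 5.1, and 6.1: the inner interval ends after the outer one.
UNRESOLVABLE = {"<", "m", "d", "o", "s"}


def is_resolvable(outer, inner):
    """True when the inner interval completes no later than the outer interval."""
    return allen_relation(outer, inner) not in UNRESOLVABLE


contains_case = is_resolvable((1, 5), (2, 4))   # case 4.2 (di)
overlap_case = is_resolvable((1, 3), (2, 6))    # case 5.1 (o)
```

Under these assumptions the thirteen cases collapse to one comparison: the nesting is unresolvable exactly when the inner interval's upper bound lies after the outer interval's upper bound.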

A temporally quantified proposition can be used to represent events like that in 
Example 1; i.e., the passenger using one ticket on a particular day, on any flight of her 
choice. 



Ashok U. Mallya, Pinar Yolum, and Munindar P. Singh 




[Figure 4: thirteen panels, one per Allen relation between the outer interval $[d_1, d_2]$ and
the inner interval $[d_3, d_4]$, with time increasing from left to right. Case 1.1: before (<);
case 1.2: after (>); case 2.1: equal (=); case 3.1: meets (m); case 3.2: met by (mi);
case 4.1: during (d); case 4.2: contains (di); case 5.1: overlaps (o); case 5.2: overlapped
by (oi); case 6.1: starts (s); case 6.2: started by (si); case 7.1: finishes (f); case 7.2:
finished by (fi).]

Fig. 4. Allen’s intervals for $[d_1, d_2](\overline{[d_3, d_4]}p)$



Solution 1. If $p$ represents the proposition that a ticket is on offer, then $\overline{[d_1, d_2]}p$ can be
used to denote that a ticket will be on offer for the period of time between $d_1$ and $d_2$.
Hence a ticket valid for an entire day would be represented by $\overline{[d_1, d_1 + 24\,\text{hours}]}p$.

Nested temporal intervals can be used to denote maintenance conditions like the 
one in Example 2. 

Solution 2. If $d_1$ denotes January 1 and $t$ denotes a time variable, and $p$ denotes that the 
company will rent a car for free, then the proposition $[d_1, d_1 + 31\,\text{days}][t, t + 7\,\text{days}]p$ 
denotes one week of free rental in the month of January. 
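
The nested proposition of Solution 2 can be illustrated with a small sketch. It assumes, purely for illustration, that time is discretized into integer days, that the outer interval is read existentially (some week in the month), that the inner interval is read universally (every day of that week), and that a week is the seven days $t, \ldots, t + 6$. The function name `holds_some_week` and its parameters are hypothetical and do not come from the paper.

```python
from typing import Callable

def holds_some_week(p: Callable[[int], bool], start: int, span: int, week: int = 7) -> bool:
    """Return True if some window of `week` consecutive days inside
    [start, start + span] has p true on every day of the window.

    Assumption: days are integers and the window starting at t covers
    the days t, t + 1, ..., t + week - 1.
    """
    last_start = start + span - week  # latest t whose window still fits
    return any(
        all(p(day) for day in range(t, t + week))  # inner, universal reading
        for t in range(start, last_start + 1)      # outer, existential reading
    )

# Free rental on days 10..16 (seven days) satisfies the proposition.
free_days = set(range(10, 17))
print(holds_some_week(lambda d: d in free_days, start=1, span=31))
```

A run of only six free days would make the function return False, because no complete week is covered.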

Note that it only requires a simple extension of our language to be able to specify 
timestamps in relation to one another. 

We next formalize the notions of nested intervals and results about their resolution. 
These results can be used to detect some commitment violations before they occur. 

Definition 1. A temporally quantified proposition is positive-resolvable at an instant if 
its value is known to be true at that instant; it is negative-resolvable at an instant if its 
value is known to be false at that instant. 

Definition 2. A temporal commitment is positive-resolvable at an instant if its satisfaction 
can be known at that instant; it is negative-resolvable at an instant if its breach can 
be known at that instant. 

We use the following notation to denote some important instants with respect to an 
interval. Below, $p_t$ is a temporal proposition, and $r$ denotes resolvability. Given an 
interval, 

$r^{+}_{\bot}(p_t)$ represents the earliest instant at which $p_t$ is positive-resolvable; 

$r^{+}_{\top}(p_t)$ represents the latest instant in that interval at which $p_t$ is positive-resolvable; 

$r^{-}_{\bot}(p_t)$ represents the earliest instant at which $p_t$ is negative-resolvable; 

$r^{-}_{\top}(p_t)$ represents the latest instant in that interval at which $p_t$ is negative-resolvable. 

The following observations form the base cases for detecting resolvability of propositions 
that have intervals nested to any arbitrary depth and the resolvability of temporal 
commitments. Here, $p$ is a proposition. 

$r^{+}_{\bot}([d_l, d_u]p) = d_l$, $r^{+}_{\top}([d_l, d_u]p) = d_u$ (existentially quantified interval). 
$r^{+}_{\bot}([d_l, d_u]p) = d_u$, $r^{+}_{\top}([d_l, d_u]p) = d_u$ (universally quantified interval). 
$r^{-}_{\bot}([d_l, d_u]p) = d_u$, $r^{-}_{\top}([d_l, d_u]p) = d_u$ (existentially quantified interval). 
$r^{-}_{\bot}([d_l, d_u]p) = d_l$, $r^{-}_{\top}([d_l, d_u]p) = d_u$ (universally quantified interval). 

These observations imply that $p_t$ is not positive-resolvable at any instant before $r^{+}_{\bot}(p_t)$, 
not negative-resolvable at any instant before $r^{-}_{\bot}(p_t)$, positive-resolvable at any instant 
after $r^{+}_{\top}(p_t)$, and negative-resolvable at any instant after $r^{-}_{\top}(p_t)$. 
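
The base cases can be sketched in code under one stated reading, which is an assumption here: an existentially quantified interval can be confirmed as early as its lower bound but refuted only at its upper bound, while a universally quantified interval can be refuted as early as its lower bound but confirmed only at its upper bound. The names `Bounds` and `base_bounds` and the labels `"exists"` and `"forall"` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bounds:
    earliest_pos: float  # earliest instant of positive resolvability
    latest_pos: float    # latest instant of positive resolvability
    earliest_neg: float  # earliest instant of negative resolvability
    latest_neg: float    # latest instant of negative resolvability

def base_bounds(kind: str, d_l: float, d_u: float) -> Bounds:
    """Resolvability instants for a proposition quantified over [d_l, d_u]."""
    if kind == "exists":
        # p may hold as early as d_l; its absence is known only at d_u.
        return Bounds(d_l, d_u, d_u, d_u)
    if kind == "forall":
        # p may fail as early as d_l; its persistence is known only at d_u.
        return Bounds(d_u, d_u, d_l, d_u)
    raise ValueError(f"unknown quantifier kind: {kind}")

print(base_bounds("exists", 0, 24))
print(base_bounds("forall", 0, 24))
```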

4.2 Resolving Nested Interval Expressions 

Using the rules in Section 4.1, we can now see why some of the two-level interval 
nesting cases shown in Figure 4 were determined to be unresolvable. 

In cases 1.1, 3.1, 4.1, 5.1, and 6.1, the earliest instant at which the satisfaction of 
$[d_3, d_4]q$ can be determined is $d_4$, which is beyond $d_2$, the latest instant for the 
satisfaction of $[d_1, d_2]p$. As a consequence, the expression $[d_1, d_2]([d_3, d_4]q)$ cannot be 
resolved, which is why commitments whose conditions are propositions of this type are 
disadvantageous for the creditor. 
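
The case analysis above reduces to a single comparison: the nesting is resolvable when $d_4$ does not lie beyond $d_2$. The following sketch classifies the Allen relation between the two intervals and applies that comparison. It assumes numeric timestamps with $d_1 < d_2$ and $d_3 < d_4$; the function names are hypothetical.

```python
def allen_relation(d1, d2, d3, d4) -> str:
    """Allen relation of [d1, d2] to [d3, d4], assuming d1 < d2 and d3 < d4."""
    if d2 < d3:
        return "<"
    if d4 < d1:
        return ">"
    if d2 == d3:
        return "m"
    if d4 == d1:
        return "mi"
    if d1 == d3 and d2 == d4:
        return "="
    if d1 == d3:
        return "s" if d2 < d4 else "si"
    if d2 == d4:
        return "f" if d1 > d3 else "fi"
    if d3 < d1 and d2 < d4:
        return "d"
    if d1 < d3 and d4 < d2:
        return "di"
    return "o" if d1 < d3 else "oi"

def nested_resolvable(d1, d2, d3, d4) -> bool:
    """The inner interval must be resolved no later than the outer one ends."""
    return d4 <= d2

# Overlap (case 5.1): the inner interval ends after the outer one.
print(allen_relation(0, 2, 1, 3), nested_resolvable(0, 2, 1, 3))
# Contains (case 4.2): the inner interval ends inside the outer one.
print(allen_relation(0, 3, 1, 2), nested_resolvable(0, 3, 1, 2))
```

With this check, the relations `<`, `m`, `d`, `o`, and `s` come out unresolvable and the remaining relations resolvable, which agrees with the cases listed in the text.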

Solution 3. To model Example 3, the hotel $H$ makes a commitment to a customer $c$. The 
commitment is $C(id, H, c, A[d_1, d_1 + 24\,\text{hrs}]([d_2, d_2 + 7\,\text{days}]q))$, where $d_1 + 24\,\text{hrs} < d_2$ 
because it is not spring break yet, $[d_1, d_1 + 24\,\text{hrs}]$ denotes the interval “today” 
(say, a day in July), $[d_2, d_2 + 7\,\text{days}]$ denotes the interval when spring break happens, 
and $q$ is an atomic proposition that denotes some offer that the coupon offers. In this 
case, $[d_2, d_2 + 7\,\text{days}]q$ cannot be resolved at least until $d_2 + 7\,\text{days}$, and 
$[d_1, d_1 + 24\,\text{hrs}](\cdot)$ has to be resolved at most by $d_1 + 24\,\text{hrs}$. But since $d_1 + 24\,\text{hrs} < d_2 + 7\,\text{days}$, 
this condition cannot be resolved. Hence the commitment cannot be satisfied. Formally, 
$r^{+}_{\top}([d_1, d_1 + 24\,\text{hrs}](\cdot)) < r^{+}_{\bot}([d_2, d_2 + 7\,\text{days}]q)$. 

To summarize, the following conditions are necessary to ensure resolvability of a 
temporally quantified proposition: 

A temporally quantified proposition of the form $[d_l, d_u]p_t$ with an existentially quantified 
interval must have at least one instant in the interval $[d_l, d_u]$ at which $p_t$ is resolvable. 
A temporally quantified proposition of the form $[d_l, d_u]p_t$ with a universally quantified 
interval must have $p_t$ resolvable at all instants in the interval $[d_l, d_u]$. 

For a commitment $c$, the following lemmas indicate the three-valued logic of satisfaction 
due to the satisfied($\cdot$) and the breached($\cdot$) predicates. 

Lemma 6. 

$M \models_{S,m} \mathit{create}(x, c) \Rightarrow (\forall m_x : m < m_x \Rightarrow (M \models_{S,m_x} \mathit{satisfied}(c) \Rightarrow M \models_{S,m_x} AG\,\neg\mathit{active}(c)))$. 

Proof. By the definition of the predicate active($\cdot$) and the semantic rule R3. 

Lemma 7. 

$M \models_{S,m} \mathit{create}(x, c) \Rightarrow (\forall m_x : m < m_x \Rightarrow (M \models_{S,m_x} \mathit{breached}(c) \Rightarrow M \models_{S,m_x} AG\,\mathit{active}(c)))$. 

Proof. By the constraint C5 and Lemma 2. 

We have shown how unresolvable commitments can be detected. Such resolution re- 
sults will enable earlier detection of protocol violations, and are of practical importance 
where an unresolvable commitment is as good (or as bad) as one that is breached. 

4.3 Disjunctive Forms 

Another important aspect of resolution concerns disjunctive commitments whose con- 
ditions are disjunctions of temporally-quantified propositions. 

Disjunctive commitments regularly arise in common business interactions and can 
sometimes lead to what we call the warranty paradox — a situation where some subtle 
clauses render the warranty void before the customer can ascertain the quality of the 
good. This can happen, for instance, if ascertaining the quality of the good takes more 
time than the life of the warranty. 

Intuitively, we reason as follows about the satisfiability of a disjunction of temporal 
propositions: A disjunction of temporal propositions can potentially be satisfied if it has 
not already been satisfied, and at least one of the disjuncts is still resolvable. 

Let $p_1, p_2, \ldots, p_n$ be temporally quantified propositions that occur in a disjunctive 
commitment of the form $C(id, x, y, A((p_1 \lor p_2 \lor \cdots \lor p_i) \lor (p_{i+1} \lor p_{i+2} \lor \cdots \lor p_n)))$. 
Here, $p_1 \lor p_2 \lor \cdots \lor p_i$ represents some quality that the good satisfies, as claimed by the 
merchant $x$, and $p_{i+1} \lor p_{i+2} \lor \cdots \lor p_n$ represents the replacement for the good that the 
merchant promises to the customer y. If the quality assured by the merchant becomes 
false at some moment, then the replacement proposition should be positive-resolvable 
at that moment. Otherwise, the warranty is unfavorable for the customer. 

Formally, this requirement is stated as 

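$$M \models_{S,m} \neg(p_1 \lor p_2 \lor \cdots \lor p_i) \Rightarrow \big( M \models_{S,m} (p_{i+1} \lor p_{i+2} \lor \cdots \lor p_n) \;\lor\; (\tau(m) < r^{+}_{\top}(p_{i+1} \lor p_{i+2} \lor \cdots \lor p_n) \;\land\; M \not\models_{S,m} \neg(p_{i+1} \lor p_{i+2} \lor \cdots \lor p_n)) \big)$$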

If none of the warranty disjuncts are resolvable at an instant at which the claim 
about the quality is false, then the warranty is unfavorable to the customer. 

Applying this requirement to Example 4 of Section 1 gives us the following. 

Solution 4. Example 4 can be modeled by the commitment 
$C(id, R, c, A([d_1, d_2]\mathit{good\_car} \lor [d_1, d_3]\mathit{replace\_car}))$, where “good_car” means that the car hasn’t broken down, 
“replace_car” represents the warranty that the rental company gives on the quality of 
the car, $R$ represents the rental company, and $c$ the customer. $d_1$ represents the time 
at which the car is rented on Friday, $d_2$ is the time at which the car should be returned 
on Monday, and $d_3$ denotes the closing of the rental company on Friday. Hence 
$d_3 < d_2$. We see that there exists a moment between $d_3$ and $d_2$ at which the literal 
$([d_1, d_3]\mathit{replace\_car})$ is beyond the upper bound of its positive-resolution. If the 
car breaks down on Saturday, then the only proposition that has not yet been resolved 
at that moment is the guarantee by the renter to replace it. However, the upper bound 
of positive resolvability of this proposition has passed. Formally, $\exists m : d_1 < d_3 < \tau(m) < d_2$ 
and $M \models_{S,m} \neg([d_1, d_2]\mathit{good\_car})$ and $M \not\models_{S,m} ([d_1, d_3]\mathit{replace\_car})$ 
and $r^{+}_{\top}([d_1, d_3]\mathit{replace\_car}) < \tau(m)$. 

Figure 5 shows Example 4. 



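[Figure 5: timeline showing the instants $d_1$, $d_3$, and $d_2$, with the markers $r^{+}_{\top}([d_1, d_3]\mathit{replace\_car})$ and $\tau(m)$.]

Fig. 5. Unfavorable warranty in Example 4 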



Thus we have shown how the warranty paradox can be captured in our scheme of 
temporal commitments. 
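
The warranty requirement can be sketched as a small check. The sketch assumes numeric timestamps and takes the latest instant of positive resolvability of the replacement disjunct as an input; the function name, its parameters, and the hour values chosen for Example 4 are hypothetical.

```python
def replacement_still_possible(tau_m: float,
                               replacement_holds: bool,
                               replacement_refuted: bool,
                               latest_pos: float) -> bool:
    """At a moment with time tau_m where the quality claim is false, the
    warranty is favorable if the replacement already holds, or if it has
    not been refuted and can still be positively resolved."""
    return replacement_holds or (tau_m < latest_pos and not replacement_refuted)

# Example 4 with hypothetical hour offsets from the Friday rental time:
d1, d3, d2 = 0, 8, 72   # rented, office closes on Friday, due back on Monday
breakdown = 30          # some time on Saturday, so d3 < breakdown < d2
# For an existentially quantified [d1, d3], the latest positive instant is d3.
print(replacement_still_possible(breakdown, False, False, latest_pos=d3))
```

The call returns False for a Saturday breakdown, which is the unfavorable situation described in Solution 4; a breakdown before the Friday closing time would return True.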

5 Discussion 

The concept of deadlines in commitments is doubtless necessary for practical uses of 
commitments. Traditionally, deadlines are hidden within the atomic propositions. How- 
ever, an explicit formulation of temporal commitments, as developed above, is highly 
desirable. It offers a uniform treatment of operational characteristics across domains. 
We have shown how such a system of commitments with deadlines can be developed 
and used to reason about the possibility of satisfaction. Our approach not only allows 
for the expression of statements that involve deadlines, but also decouples the temporal 
quantification from the proposition itself, thus allowing us to reason about the temporal 
aspect without regard to the meaning of the propositions. We now discuss related issues. 

In Section 2.3, we assumed that all commitments are fulfilled by the DISCHARGE 
operation, and not just by the condition $p$ holding. This assumption helps simplify the 
theory because it excludes a situation where a commitment $C(id, x, y, p)$ is satisfied by 
$p$ being brought about by some agent $z$ rather than by agent $x$. However, the assumption 
does not weaken the scope of the commitment operations. We may require that only 
agent $x$ bring about $p$. The domain language used to specify $p$ should be rich enough to 
incorporate such constraints. 

For instance, the condition $p$ might be “turn the lights on”, in which case any agent 
could turn the lights on to fulfill the commitment $C(id, x, y, p)$. In fact, we might want 
to allow such a fulfillment. On the other hand, if the condition were subjective, like 
“teach a graduate course”, we might want to make sure that only the intended party 
brings about the condition. We would therefore have to have a language for specifying 
p that can express that “x will teach a graduate course”. 

5.1 Literature 

Literature related to our work can be classified according to two main orthogonal fields: 

- The semantics of agent interaction protocols. 

- The semantics and implementation of business processes. 

A middle ground, where our work lies, is the operational aspects of commitments. Our 
work embodies desirable aspects of both the semantics and the business processes. 

Semantics. Dignum et al. [9] describe a temporal deontic logic that helps specify obligations 
and constraints. The work focuses on specifying deadlines, so that a planner can 
take deadlines into account while generating plans. Their approach, however, is based 
on the notion of obligations, and no operational methods for obligations are given. Once 
a deadline has passed, and a certain rule has been violated, the logic has nothing to say 
about the effects on the system. This work, although semantically rich, has not been designed 
with an operational framework in mind. Nevertheless, Dignum et al.’s approach 
is detailed in the kinds of deadlines and constraints that it allows to be modeled. For 
example, deadlines like “. . . as soon as possible”, which cannot be modeled in our grammar, 
can be modeled using theirs. We have, however, a system that is closer to being 
operationalized than theirs. 

Business Protocols. Grosof et al. [10] develop semantics for systems that represent 
business rules using Courteous Logic Programs (CLP). Their approach uses explicit rules to 
decide between conflicting rules, and the emphasis is on the implementation and 
application of CLP to real businesses. The work, however, does not define the semantics 
for the grammar they use. Business rules are represented by general if-then clauses. 
Hence, all concepts beyond the structure of the if-then rules are domain specific, including 
the temporal references. Their work, however, has been applied to actual business 
systems, and proves the value of intelligent agents in business processes. 

From the standpoint of real-world implementation, the Business Transaction Proto- 
col proposed by the OASIS [11] addresses the need for long-lived interactions among 
web services as opposed to short-lived transactions in the classical sense of the word. 
This protocol also recognizes that many of the offers made in real-world businesses 
involve deadlines. For example, an airline participating as a web service provider in a 
BTP run could specify how long it is willing to hold an offer open when it agrees to 
take part in a transaction [12]. 

Commitment Operations. Our scheme for temporal quantification is related to a similar 
notion presented by Fornara and Colombetti [2]. Their approach seeks to operationalize 
commitments and define the life cycle of a commitment, but does not pay attention to 
the issue of deadlines and temporally sensitive commitments, whereas our approach de- 
velops interesting results about the resolution of fulfillment of commitments. Although 
Fornara and Colombetti have taken a good first step towards the operationalization of 
commitments in agent interaction, they stop short of developing a semantics for tempo- 
ral commitments as we have done here. 

Verdicchio and Colombetti [13] develop a theory of the evolution of commitments 
over time. Their work is closest to ours among all others discussed in this section. It 
specifies constraints on the creation and satisfaction of temporal commitments using a 
variant of CTL*. 

McBurney and Parsons [14] define the Posit Spaces Protocol , which is an agent 
interaction protocol. This protocol uses a central repository for storing all the commit- 
ments that have been made. The idea is simple and corresponds to the concept of the 
Sphere of Commitment that was proposed by Singh [15]. This work does not relate 
to our results directly, but is yet another demonstration of the use of commitments in 
modeling and building multiagent systems. 

5.2 Directions 

Our work on temporal aspects of commitments is far from complete. One direction for 
further research is to investigate ways to do away with rigid protocols by having agents 
commit to each other by taking small risks at a time to finally arrive at a state where 
both parties are committed so that there is no risk of a loss to one party. This is just an 
intuitive notion, and further work is required to assess the viability of this approach. 

Another direction is to describe agent interaction protocols using the theory of uni- 
versal causation developed by Giunchiglia et al. [16], so that commitment machines 
that exploit nonmonotonic reasoning can be automatically generated. A commitment 
machine, proposed by Yolum and Singh [1], is a novel way of representing interaction 
protocols using commitments. It specifies the states that are allowed in the interaction 
in terms of the commitments that hold in those states. A commitment machine allows 
greater flexibility in the enactment of a protocol since interacting agents have many 
ways of reaching final states. Nonmonotonic reasoning in commitment machines would 
allow greater flexibility as compared to the incremental inferencing approach used by 
Yolum and Singh [1]. Chopra and Singh [17] present this idea in greater detail and 
clarity. 

Venkataraman and Singh [3] develop a vector-clock based scheme to verify agents’ 
compliance to a commitment protocol. However, they do not consider rich temporal 
structures as we have done here. It will be interesting to apply the theory developed 
here to their scheme of compliance checking. 

Abstract. Protocols for multiagent interaction need to be flexible because of 
the open and dynamic nature of multiagent systems. Such protocols cannot be 
modeled adequately via finite state machines (FSMs) as FSM representations 
lead to rigid protocols. We propose a commitment-based formalism called Non- 
monotonic Commitment Machines (NCMs) for representing multiagent interac- 
tion protocols. In this approach, we give semantics to states and actions in a 
protocol in terms of commitments. Protocols represented as NCMs afford the 
agent flexibility in interactions with other agents. In particular, situations in pro- 
tocols when nonmonotonic reasoning is required can be efficiently represented in 
NCMs. 



1 Introduction 

A protocol is a means of achieving meaningful interaction. Agents that constitute a 
multiagent system use protocols to guide their interactions with each other. Protocols 
have traditionally been specified as FSMs that specify sequences of states. The protocol 
designers have certain scenarios in mind that they directly incorporate in an FSM. As a 
result, agents using a protocol specified as FSMs are limited to behaving in a rigid man- 
ner. Such agents cannot handle exceptions or take advantage of opportunities that might 
arise during interactions with other agents. In this paper, we present an alternative way 
of specifying protocols that is based on commitments which we formalize below. Our 
approach is based on the general notion that an agent does not violate a given protocol 
as long as the agent does not violate the commitments prescribed by the protocol. Using 
commitments makes the protocol flexible and enables the agent to handle exceptions 
and opportunities without violating the given protocol. 

Protocols for interaction in multiagent systems often resemble protocols routinely 
used by humans in their social interactions. The Contract Net [1] and NetBill [2] are ex- 
amples of such protocols. Such protocols have traditionally been represented by FSMs 
that represent sequences of states and transitions. Since FSMs are a low level represen- 
tation, it becomes cumbersome to capture multiple scenarios in an FSM. Thus FSMs 
designed by hand tend to be rigid and do not allow scenarios other than the specified 
“normal” ones. A protocol transitions from state to state as a result of the actions of 
the interacting agents. A transition is usually labeled with the actions that cause it. In 
an FSM, the states and the actions in the protocol are meaningless tokens. The agent is 
limited to executing one of the sequences of actions hard-coded by its designer. These 
sequences represent the only legal behaviors of the agent. Anything the agent does out- 
side of this protocol is considered a violation. This makes the protocol inflexible and, 
therefore, undesirable in open multiagent systems where agents are autonomous and 
heterogeneous, and opportunities and exceptions need to be handled appropriately. 

Acting flexibly presupposes reasoning about the protocols. Reasoning formally pre- 
supposes that the protocols have a formal semantics. We base our semantics on the 
notion of commitments. Protocols can naturally be seen as an exchange and manipula- 
tion of commitments. A commitment is a directed obligation from one agent to another 
for achieving or maintaining a state of affairs. A commitment is social because it in- 
volves two parties and is publicly observable by all the agents in the agent society. 
Since a commitment is public, it is also possible to verify whether an agent has ful- 
filled its commitment, thereby making it possible to check an agent’s compliance with 
a protocol. 

This paper uses the NetBill e-commerce protocol [2] as a running example. 
Figure 1 shows an FSM representation of NetBill, simplified to focus on 
the core part. The customer, represented by c, sends a request for offers to the merchant, 
represented by m. The merchant sends an offer in response. If the customer accepts the 
offer, the merchant sends the goods. The customer then sends the payment for the goods 
in return for which the merchant sends a receipt. The only execution scenario possible 
in the protocol starts with the customer sending a request and ends with the merchant 
sending a receipt. The FSM in Figure 1 does not accommodate scenarios that would 
arise naturally in open and dynamic multiagent systems and is, therefore, unnecessarily 
rigid. Protocols for multiagent interaction should be flexible in the following ways: 

- Autonomy: A protocol specification should not impinge on the autonomy of an 
agent beyond the essential nature of the interaction it describes. Consider a scenario 
where a customer wants to buy goods from a merchant. A desirable specification 
should not limit the autonomy of a merchant by preventing him from advertising 
his wares by sending an offer message prior to receiving a request for offers. 

- Opportunities: A protocol should enable an agent to take advantage of opportuni- 
ties that may arise. For example, if a merchant advertises an attractive deal to a 
customer, the customer should be able to entertain this offer. 

- Exceptions: A protocol should enable an agent to deal with exceptions instead 
of aborting the interaction altogether. For example, a customer who doesn’t have 
enough money might delegate a commitment to pay the merchant to some other 
agent. This is not allowed in the NetBill protocol as specified above. 
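
The rigidity of the FSM representation can be seen in a minimal sketch of the simplified NetBill exchange described above. The state names and message labels are hypothetical, since the figure itself is not reproduced here; only the order of messages follows the prose.

```python
from typing import Iterable

# Hypothetical state names; the message order follows the prose description.
TRANSITIONS = {
    ("start", "c:request"): "requested",
    ("requested", "m:offer"): "offered",
    ("offered", "c:accept"): "accepted",
    ("accepted", "m:goods"): "delivered",
    ("delivered", "c:pay"): "paid",
    ("paid", "m:receipt"): "done",
}

def accepts(messages: Iterable[str]) -> bool:
    """Return True only if the messages follow the single hard-coded path."""
    state = "start"
    for msg in messages:
        nxt = TRANSITIONS.get((state, msg))
        if nxt is None:      # any deviation counts as a protocol violation
            return False
        state = nxt
    return state == "done"

normal = ["c:request", "m:offer", "c:accept", "m:goods", "c:pay", "m:receipt"]
print(accepts(normal))      # the one scenario the FSM allows
print(accepts(normal[1:]))  # merchant advertises first: rejected by the FSM
```

The second call illustrates the autonomy concern: an unsolicited offer is a reasonable business move, yet the FSM treats it as a violation.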

The above forms of flexibility can be achieved only if we are able to reason about the 
content of the states and actions in a protocol. Often, the essential element of content in 
many protocols is the commitments of the different parties in a protocol. Specifically, 
we claim that if the protocol representation uses commitments and the criterion for 
protocol compliance is the satisfaction of commitments, then the above scenarios would 
be valid behaviors in the protocol. 

We propose a formalism for specifying protocols called Nonmonotonic Commit- 
ment Machines (NCMs) that uses commitments for representing states and actions. 

Fig. 1. FSM representation of the simplified NetBill protocol 

The meaning of a state is given by the commitments that hold in that state; a state is 
a description of the world. The meaning of an action is given by how it manipulates 
commitments. An NCM does not directly specify sequences of states and transitions. 
Instead, it specifies rules in Nonmonotonic Causal Logic (NCL) [3]. These rules model 
the changes in the state of a protocol as a result of execution of actions. The inference 
mechanism of NCL computes new states at runtime. Yolum and Singh [4] first studied 
commitment machines. They did not consider situations during the execution of a pro- 
tocol when agents must act with incomplete information. In such situations, the agent 
would need nonmonotonic or defeasible reasoning. Since NCL supports nonmonotonic 
reasoning, NCMs can express defaults in a protocol in a natural manner. Protocols rep- 
resented as NCMs are more elaboration tolerant [5] than those represented using clas- 
sical logic or FSMs. We develop a causal theory of commitments and represent the 
NetBill protocol as an NCM using that theory. 

The rest of the paper is organized as follows. Section 2 motivates the need for non- 
monotonic logic for protocol representation and describes the NCL that we employ for 
this purpose. Section 3 provides a description of commitments. Section 4 formalizes 
commitments and NetBill in this logic. Section 5 discusses the relevant literature and 
section 6 discusses future directions. 

2 Nonmonotonic Causal Logic 

In logic, the consequence relation $\vdash$ is a relation between sets of propositions and 
individual propositions. $A \vdash x$, where $A$ is a set of propositions and $x$ is a proposition, 
means that $x$ is a logical consequence of $A$. Classical logic is monotonic, meaning that 
if $A \subseteq B$, where $A \vdash x$ and $B$ is a set of propositions, then $B \vdash x$. Informally, 
monotonicity means that the addition of new information does not invalidate old information. 
Therefore, making rules defeasible in the face of change poses difficulties. Consider 
Example 1. 

Example 1. A customer may not return goods received and should pay for them. How- 
ever, if the goods are damaged, then the customer may return the goods and then cancel 
his commitment to pay. 

Example 1 involves defeasible reasoning. The default rule is that the commitment to 
pay cannot be canceled. (A more accurate rule is that the debtor of a commitment can- 
not cancel his commitment. However, the above simplified version is adequate for this 
example.) The general rule is defeasible, i.e., if a special condition applies, like when 
the goods are damaged and they are returned, the commitment can be canceled. The 
default rule should be applicable when no information about the condition is available. 
This is the essence of nonmonotonic reasoning. Such defeasible reasoning is beyond 
the realm of classical logic. 

2.1 Introduction 

To overcome the aforementioned difficulties in specifying protocols, we use NCL. We 
choose NCL because it has an intuitive syntax and semantics. The language C+ [3], 
which is based on NCL, has a semantics based on state transition systems which agrees 
with our intuition about protocols. Further, it has been shown to be elaboration tolerant 
[6]. NCL is a logic of universal causation meaning that every fact that is caused holds 
and every fact that holds is caused. Universal causation is not so much a philosophical 
stance as a practical one, as universal causation yields a uniform semantics for causal 
theories. Taking this stance makes NCL suitable for simulation and planning since ev- 
erything can be explained. Also, in our domain, a commitment holds or does not hold 
only because there is a reason for it to hold or not hold. Moreover, as we show below, 
universal causation can be disabled for selected formulas. 

The signature of a causal theory is the set $\sigma$ of symbols called constants. Each 
constant $c$ is assigned a nonempty finite domain $Dom(c)$ of symbols. An atom is of the 
form $c = v$ where $v \in Dom(c)$. An interpretation of $\sigma$ is an assignment $c = v$ for each 
$c \in \sigma$ where $v \in Dom(c)$. Since we consider only boolean atoms, either $c$ = true or $c$ 
= false. A formula in NCL is a combination of atoms using the connectives of classical 
logic. A causal rule is of the form $F \Leftarrow G$, where $F$ and $G$ are formulas of classical logic 
and are called the head and the body of the rule, respectively. This means that there is 
cause for F to be true if G is true. It does not say that G is the cause for F. This reflects 
the intuition that it is sufficient to know the conditions under which a fact is caused. 
As an example, consider a switch S that, when closed, lights two bulbs A and B. Even 
though A being lit is not the cause for B being lit, it is correct to say that there is a cause 
for B to be lit when A is lit. A theory in NCL consists of a set of causal rules. 

Constants in causal theories are either fluents or actions. A causal theory describes 
histories of length $m + 1$ ($m > 0$) by creating, for all $i \in \{0, \ldots, m\}$, a copy of every 
fluent and, for all $i \in \{0, \ldots, m - 1\}$, a copy of every action. The interpretation of 
fluents for a particular $i$ represents state $s_i$, and the interpretation of actions in state $s_i$ 
represents the transition to state $s_{i+1}$. 

2.2 NCL Semantics 

Models for formulas in NCL are defined in the same way as classical logic. An inter- 
pretation is a model of a set X of formulas iff it satisfies all the formulas in X. If every 
model of $X$ satisfies a formula $F$, then $X$ entails $F$, or symbolically $X \models F$. We can now 
define models of a causal theory. Let $I$ be an interpretation of the signature $\sigma$ of a theory 
$T$. The reduct $T^I$ is the set of all heads whose bodies are satisfied by the interpretation 
$I$. If $I$ is also the unique model of $T^I$, then $I$ is a model of $T$. If $I$ is not the unique model 
of the reduct, then some constant is missing from the reduct, and therefore there is no 
explanation for that constant. But NCL is a logic of universal causation; therefore, no 
constant should be unexplained. 

As an example, consider the following theory $T_1$ consisting of rules R1 and R2. 

R1. $p \Leftarrow q$ 

R2. $q \Leftarrow q$ 

R1 says that there is a cause for $p$ if $q$ is true. R2 says that there is a cause for $q$ if $q$ is true. 
Reasoning informally using the principle of universal causation, we see that there is no 
cause for $\neg p$ to be true. Therefore, $p$ has to be true. Therefore, $p$ must be caused. The 
only way $p$ can be caused is if $q$ is true. And, $q$ is caused by R2. Therefore, the only 
possible interpretation for this theory is $I = \{p = \text{true}, q = \text{true}\}$. 

Intuitively, $T^I$ represents facts that are caused, according to theory $T$ under 
interpretation $I$. If a causal theory $T$ has a model $I$, we say that it is consistent or satisfiable. 
If all models of $T$ satisfy a formula $F$, that means $T$ entails $F$, or $T \models F$. 

Coming back to our example theory $T_1$ with rules R1 and R2, we consider all the 
possible interpretations to see which, if any, is a model of $T_1$: 

1. $I_1 = \{p = \text{true}, q = \text{true}\}$: $T_1^{I_1} = \{p, q\}$. $I_1$ is the unique model of $T_1^{I_1}$. Therefore, $I_1$ 
is a model of $T_1$. 

2. $I_2 = \{p = \text{false}, q = \text{true}\}$: $T_1^{I_2} = \{p, q\}$. $I_2$ is not a model of $T_1^{I_2}$. Therefore, $I_2$ is 
not a model of $T_1$. 

3. $I_3 = \{p = \text{true}, q = \text{false}\}$: $T_1^{I_3} = \{\}$. Every interpretation satisfies the empty reduct, so 
$I_3$ is not its unique model. Therefore, $I_3$ is not a model of $T_1$. 

4. $I_4 = \{p = \text{false}, q = \text{false}\}$: $T_1^{I_4} = \{\}$. Again, $I_4$ is not the unique model of the empty 
reduct. Therefore, $I_4$ is not a model of $T_1$. 

Note that $I_1 = \{p = \text{true}, q = \text{true}\}$ is the only model of this theory, which matches the 
result of our informal reasoning. 
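
The enumeration above can be reproduced mechanically. The following sketch assumes boolean constants only and represents each causal rule as a pair of Python predicates (head, body) over an interpretation; the function names are hypothetical, and the brute-force search is meant only for very small theories.

```python
from itertools import product

def interpretations(constants):
    """Yield every assignment of True/False to the given constants."""
    for values in product([True, False], repeat=len(constants)):
        yield dict(zip(constants, values))

def models(constants, rules):
    """Return the interpretations I that are the unique model of the reduct,
    i.e. of the heads of all rules whose bodies I satisfies."""
    result = []
    for I in interpretations(constants):
        reduct = [head for head, body in rules if body(I)]
        satisfying = [J for J in interpretations(constants)
                      if all(head(J) for head in reduct)]
        if satisfying == [I]:  # I satisfies the reduct and nothing else does
            result.append(I)
    return result

T1 = [
    (lambda I: I["p"], lambda I: I["q"]),  # R1: p <= q
    (lambda I: I["q"], lambda I: I["q"]),  # R2: q <= q
]

print(models(["p", "q"], T1))  # [{'p': True, 'q': True}]
```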

To see how NCL is nonmonotonic, consider a theory $T_2$ consisting of the rule 
$c = 1 \Leftarrow c = 1$. This rule is like a default rule. The only model of $T_2$ is $I(c) = 1$. Now 
consider a theory $T_3$ that has two rules, $c = 1 \Leftarrow c = 1$ and $c = 2 \Leftarrow \text{true}$. 
Note that $T_2 \subseteq T_3$. However, the only model of $T_3$ is $I(c) = 2$. 
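
The nonmonotonic behavior can be checked with a short sketch for a theory over a single multi-valued constant $c$. The domain $\{1, 2\}$ is an assumption made for illustration, since the text does not state $Dom(c)$; each rule is written as a pair (head value, body predicate), and the names are hypothetical.

```python
def models_single(domain, rules):
    """Models of a causal theory over one constant c.

    Each rule (v, body) stands for the causal rule  c = v <= body(c).
    """
    found = []
    for value in domain:
        # Reduct: heads of the rules whose bodies hold when c = value.
        reduct = {head for head, body in rules if body(value)}
        satisfying = [v for v in domain if all(v == h for h in reduct)]
        if satisfying == [value]:  # value is the unique model of the reduct
            found.append(value)
    return found

DOMAIN = [1, 2]                      # assumed domain of c
T2 = [(1, lambda c: c == 1)]         # c = 1 <= c = 1
T3 = T2 + [(2, lambda c: True)]      # adds  c = 2 <= true

print(models_single(DOMAIN, T2))  # [1]
print(models_single(DOMAIN, T3))  # [2]
```

Adding a rule to $T_2$ removes its only model and yields a different one, so a conclusion drawn from $T_2$ does not survive in the larger theory $T_3$.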

2.3 Action Descriptions in C+ 

Recall that C+ is a high level action description language based on NCL. It is easier 
to specify theories in C+ than directly in NCL because of its concise notation. Before 
we describe the syntax and semantics of NCL and C+ formally, we describe informally 
the meanings of some of C+ rules that we use later. A formula in C+ is a propositional 
combination of constants which could either be action constants or fluent constants. 
Actions in NCL are interpreted to be unit-length. This paper is restricted to boolean 
constants. An action constant being true represents the execution of the action. The 
meanings of the rules we use follow. 

- a causes b, where a and b are actions. 

This means that action a causes action b and both happen concurrently. 

- a causes f, where a is an action and f is a fluent. 

This means that action a causes f to hold in the next state. 

- $A \land F$ causes b, where A is a conjunction of actions, b is an action and F is a 
conjunction of fluents. 

This means that in a state where F is true and the actions in A happen, action b 
happens concurrently. 

- a causes a, together with the rule $\neg a$ causes $\neg a$, where a is an action. 

These two rules mean that there is a cause for a and there is a cause for $\neg a$, respectively. 
In other words, the action a is exogenous: it simply happens or does not happen. 
Without these rules, the action is not exogenous. Universal causation is disabled 
for these rules. 

- a may cause f, where a is an action and f is a fluent. 

This means that f may hold after a’s execution if it does not already hold. Thus, this 
rule expresses nondeterminism. 

- caused a after f, where a is an action and f is a fluent. 

This means that f causes a in the same state. Notice that this rule uses the caused 
form and not the causes form because no suitable formulation in terms of causes 
exists. This rule expresses the causation of an action and differs from the rule a 
causes f above, which expresses the causation of a fluent. 

- Fluents are declared as inertialFluents meaning that their assignment persists from 
one state to the next unless changed by some other rule. 

2.4 Translating C+ to NCL 

Let’s describe C+ formally and show the translation from rules in C+ to rules in NCL. 
C+ includes three kinds of rules, namely, static rules, fluent dynamic rules and action 
dynamic rules. A fluent can be either a statically determined fluent or a simple (dynamic)
fluent. Static fluents can appear in the heads of only static rules. A fluent formula con- 
sists only of fluents. An action formula consists only of actions. A static rule is an 
expression of the form 

R3. caused F if G 

where F and G are fluent formulas. Static rules express indirect effects of actions 
that are instantaneous with respect to the causal fluent formula. A dynamic rule is an 
expression of the form 

R4. caused F if G after H 

It is called an action dynamic rule if F and G are both action formulas. It is called a 
fluent dynamic rule if F and G are both fluent formulas. Action dynamic rules express 
the causation of an action and fluent dynamic rules the causation of a fluent. 

An action description D consisting of such rules is turned into a causal theory D_m,
where m is the length of the history. The signature of D_m then consists of the
following, where σ denotes the signature of D:

1. i:c for every fluent c ∈ σ and every i ∈ {0, ..., m}

2. i:c for every action c ∈ σ and every i ∈ {0, ..., m-1}

The domain of i:c is the same as Dom(c) and i:F means i is inserted in front of every
occurrence of every constant in F. The rules of D_m are then:

R5. i:F ⇐ i:G, for every static rule R3 in D and every i ∈ {0, ..., m};

R6. i:F ⇐ i:G ∧ i:H, for every action dynamic rule R4 in D and every i ∈ {0, ..., m-1};

R7. i+1:F ⇐ i+1:G ∧ i:H, for every fluent dynamic rule R4 in D and every i ∈ {0, ..., m-1};

R8. 0:c = v ⇐ 0:c = v, for every simple fluent constant c ∈ σ and every v ∈ Dom(c).
(Notice that every simple fluent has all possible values in the initial state and
therefore, they are exogenous in the initial state. Thus, universal causation is
disabled for them.)
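
To make the translation concrete, the following Python sketch unrolls a small action description into the time-stamped rules of D_m, following R5 to R8. It is an illustration only: the Rule record, the string encoding i:(F) and the function names are assumptions of this sketch, boolean fluents are assumed, and formulas are treated as opaque strings rather than parsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    kind: str            # "static", "action_dynamic" or "fluent_dynamic"
    head: str            # F
    body: str = "true"   # G
    after: str = "true"  # H (ignored for static rules)


def stamp(i, formula):
    """Stand-in for i:F, i.e. time stamp i placed in front of a formula."""
    return f"{i}:({formula})"


def unroll(rules, simple_fluents, m):
    """Return the rules of D_m as (head, body) pairs of strings."""
    out = []
    for r in rules:
        if r.kind == "static":            # R5, i in 0..m
            out += [(stamp(i, r.head), stamp(i, r.body))
                    for i in range(m + 1)]
        elif r.kind == "action_dynamic":  # R6, i in 0..m-1
            out += [(stamp(i, r.head),
                     f"{stamp(i, r.body)} & {stamp(i, r.after)}")
                    for i in range(m)]
        elif r.kind == "fluent_dynamic":  # R7, i in 0..m-1
            out += [(stamp(i + 1, r.head),
                     f"{stamp(i + 1, r.body)} & {stamp(i, r.after)}")
                    for i in range(m)]
        else:
            raise ValueError(f"unknown rule kind: {r.kind}")
    # R8: simple fluents are exogenous in the initial state
    # (boolean domains are an assumption of this sketch).
    for c in simple_fluents:
        for v in (True, False):
            atom = f"0:{c}={v}"
            out.append((atom, atom))
    return out


rules = [
    # "caused request if true after SendRequest"
    Rule("fluent_dynamic", head="request", after="SendRequest"),
    # "caused CCreate(...) if true after SendOffer"
    Rule("action_dynamic", head="CCreate(m,c,acceptc,goodsc)",
         after="SendOffer"),
]
for head, body in unroll(rules, ["request"], m=2):
    print(f"{head} <= {body}")
```

Note how the fluent dynamic rule places its head one step later than its cause, whereas the action dynamic rule places head and cause at the same step.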

All examples that follow are in the abbreviated C+ notation. We list the relevant C+ 
abbreviations below, extracted from [3]. 

A1. A dynamic rule of the form caused F if true after H abbreviates to H causes F.

A2. An action dynamic rule of the form caused F if G after H abbreviates to G ∧ H
causes F.

A3. The action dynamic rules caused a if a and caused ¬a if ¬a, where a is an action,
together abbreviate to exogenous a. In C+ such an action is called an exogenousAction.

A4. The fluent dynamic rules caused p if p after p and caused ¬p if ¬p after ¬p,
where p is a fluent, together abbreviate to inertial p.

A5. The fluent dynamic rule caused F if F after H abbreviates to H may cause F.
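
The abbreviations can be read as simple macro expansions. The short Python sketch below expands A1, A3, A4 and A5 back into their caused forms. The tuple layout (F, G, H), the "-" prefix for negation and the helper names are assumptions made for illustration and are not part of C+ or CCalc.

```python
def neg(x):
    """Negated literal, written with a leading '-'."""
    return f"-{x}"


def causes(h, f):        # A1: H causes F
    return [(f, "true", h)]


def exogenous(a):        # A3: exogenous a
    return [(a, a, None), (neg(a), neg(a), None)]


def inertial(p):         # A4: inertial p
    return [(p, p, p), (neg(p), neg(p), neg(p))]


def may_cause(h, f):     # A5: H may cause F
    return [(f, f, h)]


def show(rule):
    """Render an (F, G, H) tuple in the unabbreviated caused form."""
    head, cond, after = rule
    text = f"caused {head} if {cond}"
    return text if after is None else f"{text} after {after}"


for rule in causes("SendGoods", "goods") + may_cause("SendGoods", "damagedGoods"):
    print(show(rule))
```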

3 Commitments 

Commitments among agents have been recognized as a fundamental notion in cooper- 
ative problem solving [7-9]. Castelfranchi [10] and Krogh [11] present, respectively, a 
social and logical perspective on commitments. In our work, we do not reason about 
the commitments from the point of view of cooperation among agents. We use com- 
mitments to specify protocols. As agents interact with each other using some protocol, 
they create and manipulate commitments. The breach of a commitment represents a 
violation of a protocol. The agent that is bound to fulfill the commitment is called the 
debtor of the commitment. The agent that is the beneficiary of the commitment is called 
the creditor. 

Definition 1. A base-level commitment C(x,y,G,p) binds a debtor x to a creditor y for 
fulfilling the condition p in context G. 

Definition 2. A conditional commitment CC(x,y,G,p,q) denotes that if a condition p is 
brought about, then the commitment C(x,y,G,q) will hold. 
Both commitments and conditional commitments are created in a context G, which
can be thought of as an institution or society whose rules are binding on the agents 
that join it. The context also defines the meanings of the terms used in the context. 
Henceforth, we omit G to reduce clutter. 

Singh [12] lists operations for the creation and manipulation of commitments. These
operations cannot be carried out arbitrarily. They are subject to metacommitments, which
are rules that govern the commitment operations and are part of the context G. The
operations are listed below.

- Create(x,y,p) creates a new commitment C(x,y,p). 

- Discharge(x,y,p) discharges the existing commitment C(x,y,p) so that it no longer 
holds. 

- Cancel(x,y,p) cancels the existing commitment C(x,y,p) so that it no longer holds. 

- Delegate(x,y,p,z) delegates the commitment C(x,y,p) to a new debtor z. More specif- 
ically, the original commitment C(x,y,p) no longer holds and a new commitment 
C(z,y,p) is created in its place. 

- Assign(x,y,p,z) assigns the commitment C(x,y,p) to a new creditor z. More specif- 
ically, the original commitment C(x,y,p) no longer holds and a new commitment 
C(x,z,p) is created in its place. 

- Release(x,y,p) releases the debtor x from the commitment C(x,y,p) so that the
commitment no longer holds.
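
As an informal illustration of these operations, the following Python sketch treats the set of commitments that currently hold as an immutable set and each operation as a function from one such set to the next. The Commitment record, the function names and the agent name "bank" are assumptions of this sketch. The context G is omitted, and metacommitments (who may perform which operation, and when) are not modeled.

```python
from typing import FrozenSet, NamedTuple


class Commitment(NamedTuple):
    debtor: str
    creditor: str
    condition: str


State = FrozenSet[Commitment]


def create(s: State, x, y, p) -> State:
    return s | {Commitment(x, y, p)}


def _drop(s: State, x, y, p) -> State:
    """Remove C(x,y,p); the operations below act on an existing commitment."""
    c = Commitment(x, y, p)
    if c not in s:
        raise ValueError(f"{c} does not hold")
    return s - {c}


# Discharge, Cancel and Release all end the commitment. They differ in who
# performs them and why, which this sketch does not capture.
def discharge(s: State, x, y, p) -> State:
    return _drop(s, x, y, p)


def cancel(s: State, x, y, p) -> State:
    return _drop(s, x, y, p)


def release(s: State, x, y, p) -> State:
    return _drop(s, x, y, p)


def delegate(s: State, x, y, p, z) -> State:
    """Replace C(x,y,p) by C(z,y,p): z becomes the new debtor."""
    return _drop(s, x, y, p) | {Commitment(z, y, p)}


def assign(s: State, x, y, p, z) -> State:
    """Replace C(x,y,p) by C(x,z,p): z becomes the new creditor."""
    return _drop(s, x, y, p) | {Commitment(x, z, p)}


s = create(frozenset(), "c", "m", "payc")
s = delegate(s, "c", "m", "payc", "bank")   # "bank" is a hypothetical agent
print(sorted(s))
```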



4 NCM Representation of Protocols 

In our approach, we represent protocols as NCMs. An NCM is a causal theory in C+ 
that consists of two parts. The first part is a protocol-independent causal theory of com- 
mitments in which we capture the representation of commitments and the operations on 
them. The second part is protocol specific and includes constants and rules describing 
the given protocol’s domain. The distinction between the two parts serves only to separate
out the domain-independent part; logically, they form a complete causal theory, as we
shall see later. We first present the theory of commitments and then model the NetBill
protocol in causal logic. Together they represent the NetBill NCM.

4.1 Commitments in Causal Logic

We represent commitments and operations on them in the causal logic. Commitments 
in causal logic are declared to be constants of the type inertial fluents. 

- C(x,y,p), CC(x,y,p,q) :: inertialFluents 

where x and y are variables of the sort agent and p and q are variables of the sort
condition. By declaring commitments to be inertialFluents, we include rules of the
kind A4 for each commitment. Conditional commitments, declared as CC(x,y,p,q),
are likewise inertialFluents. For each of the operations on commitments listed in Section 3, there
is a declaration of the form







- (Operation) :: action

Constants of type action are not exogenous, that is, rules of the form A3 are not included 
in the theory. Their execution, therefore, has to be caused by other actions or fluents. 
We add two more operations for handling conditional commitments. 

- CDischarge(x,y,p,q), CCreate(x,y,p,q) :: action 

The following rules capture the meaning of the operations: 

R9. Create(x,y,p) causes C(x,y,p)

R10. Discharge(x,y,p) causes ¬C(x,y,p)

R11. Cancel(x,y,p) causes ¬C(x,y,p)

R12. Delegate(x,y,p,z) causes ¬C(x,y,p) & C(z,y,p)

R13. Release(x,y,p) causes ¬C(x,y,p)

R14. Assign(x,y,p,z) causes ¬C(x,y,p) & C(x,z,p)

R15. CCreate(x,y,p,q) causes CC(x,y,p,q)

R16. CDischarge(x,y,p,q) causes ¬CC(x,y,p,q) & C(x,y,q)

All the variables are grounded such that x ≠ y and p ≠ q. We omit the rules specifying
the grounding of the variables. Since we want the operations to be caused by other
things, in those states where an operation is not caused, there must be a reason for
it not to be caused. In other words, we want the operations to be partially exogenous.
So for each of the commitment operations, we include rules of the form

R17. ¬(Operation) causes ¬(Operation)

An example is the rule ¬Create(x,y,p) causes ¬Create(x,y,p). The specification also
includes rules to capture the restriction that no two commitment operations are concurrent.

4.2 NetBill Specification in Causal Logic 

We now represent the NetBill protocol in causal logic. The following rules together with 
the theory of commitments given above represent the specification of NetBill NCM. We 
declare 

- m, c to be of the sort agent 

- goodsc, payc, acceptc, receiptc to be of the sort condition 

- request, offer, accept, goods, pay, receipt to be inertialFluents 

- SendRequest, SendOffer, SendAccept, SendGoods, SendPayment, SendReceipt to be
exogenousActions.

The meanings of the above constants are as their names indicate. Conditions are an
artifact of conditional commitments. We assume that all fluents and actions have unique
identifiers. We have the following rules.

R18. SendRequest causes request

R19. SendOffer causes offer

R20. SendOffer causes CCreate(m,c,acceptc,goodsc)

R21. SendAccept causes accept

R22. SendAccept & CC(m,c,acceptc,goodsc) causes CDischarge(m,c,acceptc,goodsc)

R23. SendAccept causes CCreate(c,m,goodsc,payc)

R24. SendGoods causes goods

R25. SendGoods causes CCreate(m,c,payc,receiptc)

R26. SendGoods & CC(c,m,goodsc,payc) causes CDischarge(c,m,goodsc,payc)

R27. SendGoods & C(m,c,goodsc) causes Discharge(m,c,goodsc)

R28. SendPayment causes pay

R29. SendPayment & CC(m,c,payc,receiptc) causes CDischarge(m,c,payc,receiptc)

R30. SendPayment & C(c,m,payc) causes Discharge(c,m,payc)

R31. SendReceipt causes receipt

R32. SendReceipt & C(m,c,receiptc) causes Discharge(m,c,receiptc)

In our representation no two commitment operations are concurrent. Also, no two 
protocol actions are concurrent. However, when a protocol action causes a commitment 
operation, they are concurrent. By rule R6, the interpretation of ActionA causes ActionB 
is such that ActionA and ActionB are concurrent. This ensures that a protocol action
is concurrent with the commitment operation it causes. Other concurrency models
are possible for NetBill; we choose this one because of its simplicity.
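
The combined effect of the protocol rules and the commitment rules on a single step can be sketched in Python as follows. This is a hand-written approximation, not output of CCalc: the table names and the tuple encoding of commitments are assumptions of this sketch, R30 is read as discharging the base-level commitment C(c,m,payc), and neither inertia of negative literals nor the constraints on concurrency and action ordering are modeled.

```python
# Fluent caused by each protocol action (R18, R19, R21, R24, R28, R31).
FLUENT = {"SendRequest": "request", "SendOffer": "offer",
          "SendAccept": "accept", "SendGoods": "goods",
          "SendPayment": "pay", "SendReceipt": "receipt"}

# Conditional commitments created unconditionally (R20, R23, R25).
CCREATE = {"SendOffer": ("CC", "m", "c", "acceptc", "goodsc"),
           "SendAccept": ("CC", "c", "m", "goodsc", "payc"),
           "SendGoods": ("CC", "m", "c", "payc", "receiptc")}

# Conditional commitments discharged if they hold (R22, R26, R29).
CDISCHARGE = {"SendAccept": ("CC", "m", "c", "acceptc", "goodsc"),
              "SendGoods": ("CC", "c", "m", "goodsc", "payc"),
              "SendPayment": ("CC", "m", "c", "payc", "receiptc")}

# Base-level commitments discharged if they hold (R27, R30, R32).
DISCHARGE = {"SendGoods": ("C", "m", "c", "goodsc"),
             "SendPayment": ("C", "c", "m", "payc"),
             "SendReceipt": ("C", "m", "c", "receiptc")}


def step(state, action):
    """Return (operations caused concurrently with action, successor state)."""
    ops = []
    nxt = set(state)
    nxt.add(FLUENT[action])
    cc = CCREATE.get(action)
    if cc:                                   # R15
        ops.append(("CCreate", cc))
        nxt.add(cc)
    cc = CDISCHARGE.get(action)
    if cc and cc in state:                   # R16: drop CC, add C(x,y,q)
        ops.append(("CDischarge", cc))
        nxt.discard(cc)
        nxt.add(("C", cc[1], cc[2], cc[4]))
    c = DISCHARGE.get(action)
    if c and c in state:                     # R10
        ops.append(("Discharge", c))
        nxt.discard(c)
    return ops, frozenset(nxt)


state = frozenset()
for action in ["SendAccept", "SendGoods", "SendPayment", "SendReceipt"]:
    ops, state = step(state, action)
    print(action, [name for name, _ in ops])
```

Conditions are checked against the current state, while effects appear in the successor state, which mirrors the reading of rules R6 and R7.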

We now add rules for our motivating example, Example 1, to this protocol specification.
We introduce SendReturn and SendGoods as exogenous actions. We introduce a
new action Ab and a new fluent damagedGoods to indicate that the goods are damaged.
We also include the nondeterministic rule R34 to say that as a result of the SendGoods
action, the goods may be damaged. Rule R33 captures the condition that the cancel
operation is not allowed for any commitment. Rule R35, however, says that a commitment
can be canceled under abnormal conditions. Rules R36 and R37 ensure that Ab is false,
except when damagedGoods is true. We add Rules R35 to R37 to accommodate
Example 1. Rules R33 and R34 are already in the theory. Ab is an action because it has
no meaning in the states. It acts as a qualifier for the exogenous action SendReturn.

R33. ¬Cancel(x,y,p) causes ¬Cancel(x,y,p)

R34. SendGoods may cause damagedGoods

R35. SendReturn ∧ C(c,m,payc) ∧ Ab causes Cancel(c,m,payc)

R36. caused Ab if damagedGoods

R37. ¬Ab causes ¬Ab

The theory also contains rules that place constraints on the actions so that the ex- 
ecution of actions makes sense. For example, we specify that the SendRequest action 
cannot happen after the payment has been made. Rules 

4.3 Executing NetBill in CCalc 

CCalc (Causal Calculator) is a reasoning tool that implements causal logic. Given a 
causal theory and a goal in the form of a query, CCalc finds paths to the goal. We load 
the NetBill NCM into CCalc and pose queries one after the other. CCalc then finds 
paths to satisfy each query. 
:- query
label :: 0;
maxstep :: 3;
0: -offer,
   -accept,
   -returned,
   -goods,
   -C(x,y,p),
   -CC(x,y,p,q),
   -pay,
   -request,
   -receipt,
   -damagedGoods;
maxstep: returned.

Fig. 2. Example Query 



Figure 2 shows an example query. This query asks for an execution sequence of
three or fewer steps that begins in a state in which all fluents are false and ends in a
state in which the goods have been returned. Running this query in CCalc produces the
output shown in Figure 3 (formatted for readability).

The action SendAccept is caused, which in turn causes the CCreate that creates
the conditional commitment that if the goods are sent then the customer will pay. The
result of these actions is reflected in state 1. SendGoods is then caused, which reflects
the fact that the merchant has sent the goods. SendGoods causes the discharge of the 
conditional commitment created by the customer resulting in the customer’s commit- 
ment to pay. SendGoods also creates a new conditional commitment that if the customer 
pays then the merchant will send the receipt. This example is interesting as SendGoods 
also causes damagedGoods, resulting in state 2. damagedGoods causes Ab in state 2. 
So the SendReturn action is successful, which in turn causes the cancellation of the 
commitment to pay. State 3 is the resulting state which also satisfies the goal state of 
our query. 
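
The query can be mimicked, very roughly, by a bounded breadth-first search over a hand-written transition function. The Python sketch below is not how CCalc itself computes answers, and it covers only the three actions that appear in this trace. The successors function, the string encoding of fluents, the label "SendGoods (damaged)" for the nondeterministic outcome of R34, and the assumption that SendReturn makes returned true are all simplifications introduced here.

```python
from collections import deque

CC_PAY = "CC(c,m,goodsc,payc)"
C_PAY = "C(c,m,payc)"
CC_RECEIPT = "CC(m,c,payc,receiptc)"


def successors(state):
    """Yield (label, next state) pairs for a reduced NetBill model."""
    # SendAccept: R21 and R23.
    yield "SendAccept", state | {"accept", CC_PAY}
    # SendGoods: R24, R25 and, if the customer's CC holds, R26.
    goods = state | {"goods", CC_RECEIPT}
    if CC_PAY in state:
        goods = (goods - {CC_PAY}) | {C_PAY}
    yield "SendGoods", goods
    # R34: the goods may additionally be damaged.
    yield "SendGoods (damaged)", goods | {"damagedGoods"}
    # R35 and R36: Ab holds only when the goods are damaged, so the
    # return cancels the commitment to pay only in that case.
    if "damagedGoods" in state and C_PAY in state:
        yield "SendReturn", (state - {C_PAY}) | {"returned"}


def find_path(start, goal, maxstep):
    """Shortest action sequence of at most maxstep steps reaching goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal in state:
            return path
        if len(path) == maxstep:
            continue
        for label, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [label]))
    return None


print(find_path(frozenset(), "returned", maxstep=3))
```

Under these assumptions the search returns the same three-step sequence as the trace discussed above, and no sequence of two steps reaches a state in which returned holds.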

5 Discussion 

Our focus in this work is to develop meaningful representations of agent communication 
protocols. We do so by using commitments to declaratively represent states and actions. 
This gives our representation a verifiable semantics [13]. By using commitments to 
model protocols, we constrain protocols no more than is necessary. We have highlighted 
the need for a nonmonotonic logic for commonsense reasoning in protocols. We used 
NCL towards this end and showed how a protocol can be represented in NCL. Besides
NCL, there are a few other noteworthy formalisms for reasoning about action and
change. Dynamic Logic [14] is a modal logic augmented with an algebra of regular
events. However, it is monotonic and therefore not suitable for our purposes. Event
Calculus [15] and situation calculus [16] have been extended with circumscription to
enable nonmonotonic reasoning.
Solution:
State 0:
ACTIONS: CCreate(c,m,goodsc,payc) SendAccept

State 1: CC(c,m,goodsc,payc) accept
ACTIONS: CCreate(m,c,payc,receiptc) CDischarge(c,m,goodsc,payc) SendGoods

State 2: C(c,m,payc) CC(m,c,payc,receiptc) accept goods damagedGoods
ACTIONS: Cancel(c,m,payc) Ab Return

State 3: CC(m,c,payc,receiptc) accept goods returned damagedGoods



Fig. 3. Answer 



Both FSMs and NCMs are formal representations of protocols. Both approaches are 
verifiable and in the case of an FSM, trivially so. An NCM, though, represents meaning. 
Agents that can reason about commitments take actions accordingly. For example, if, 
as part of a particular protocol, an agent enters into a commitment, then the agent can 
plan its actions, even those not directly related to the protocol, so that the commitment 
is never violated. Alternate paths through the protocol may be selected based on criteria 
like safety or number of messages exchanged. For example, an agent can adopt an ap- 
proach where it does not commit unless another agent also commits for some desirable 
condition. Also it is not convenient to express defaults in FSMs. In fact, the defaults 
wouldn’t be obvious at all in an FSM. FSMs are also not as elaboration tolerant as 
NCMs. 

5.1 Conventional Protocols and Protocol Modeling 

Conventional protocols like TCP/IP, RPC, HTTP, and so on have a well-defined envi- 
ronment and scope. Their focus is on the correct delivery of data and they are therefore 
strict in the sense that they prescribe all paths for correct execution as well as for er- 
ror recovery. Modeling such protocols as FSMs is usually sufficient. Notable among 
other formalisms for modeling protocols are Petri Nets [17] and statecharts [18]. Petri 
Nets have proven to be especially useful in modeling concurrency. Petri nets specify 
transitions (T-elements) between places (P-elements) which are sets of conditions. Stat- 
echarts is a visual formalism that extends state machines by adding support for hier- 
archy, concurrency and communication. Statecharts provide the designer the power to 
cluster states into super-states as well as refine states, thereby leading to compact rep- 
resentations for complex behavior. Statecharts can also represent defaults. However, 
statecharts also specify the transitions between states. As such, neither Petri nets nor
statecharts afford much flexibility. Statecharts, though, are easier to comprehend
because of the ability to cluster and refine states, unlike NCMs, where the specification
may become difficult to manage as the number of rules increases.

5.2 Commitments 

Commitments have been studied in the context of distributed problem solving and co- 
ordination. Bratman [7] argues that for shared cooperative activity, among other things, 
commitment to joint activity and commitment to mutual support are required. Grosz 
and Kraus [9] investigate the formulation of shared plans for coordinating group action. 
In their framework, an agent can adopt two types of intentions, intend-to and intend-that,
which commit the agent to an action and a state of affairs, respectively. Jennings [19]
presents commitments as a fundamental notion for efficient coordination in distributed 
systems. Jennings also mentions conventions which monitor the commitments and state 
when a commitment may be reassessed. Jennings further reformulates different models 
of coordination in terms of commitments. A distinguishing feature of all of the above 
work is that they present commitments as a mentalistic notion, assuming a system of 
cooperative agents. Shoham’s agent oriented computing paradigm [8] introduces obli- 
gation as a modality required to describe the mental state of an agent. Sandholm and 
Lesser [20] study automated negotiation among self-interested agents whose computa- 
tions are resource bounded. They argue that protocols that have leveled commitments, 
that is, when commitments vary from breakable to unbreakable in a continuum by as- 
signing a function to evaluate the cost of breach of each commitment, are more suitable 
for contracts than full commitment protocols. Krogh [11] examines the possibility of
using deontic logic for analyzing multiagent systems. Castelfranchi [10] presents
an ontology of commitments with the aim of understanding organizational activity. He 
defines social commitment as a relation between two or more agents and discusses its 
various aspects. Singh [13] defends a commitment-based social semantics for agent 
communication. 

We do not study commitments from the point of view of coordination. However,
as we pointed out earlier in this section, an agent can reason about its future actions
depending upon the commitments it already has or ones that the agent might have to
make in the future. Also, we do not specify that commitments necessarily have to be
represented in an agent’s state. We are exploring the possibility of compiling NCMs
into FSMs that have no representations of commitments (see Section 6 for details). We
are also not concerned directly with the economic impact of the breach of a commit- 
ment. Though we specify protocols in a logical language using commitments, we do 
not present a deontic logic. Social commitments, that is, directed obligation from one 
agent to another, represent the cornerstone of our research. Our scope is limited to the 
flexible specification of protocols and their verification. 

5.3 Protocols 

Yolum and Singh [4] proposed commitment machines and also showed how event
calculus can be used to represent protocols and generate new paths in the protocols [21].
They do not consider commonsense reasoning situations. Koning and Huget [22]
describe a methodology for designing interaction protocols for multiagent systems. In
their work, the focus is on reusability and modularity of protocols. They achieve this
by composing protocols out of microprotocols using a formal language called
Communication Protocol Description Language (CPDL). A formula in CPDL corresponds to
an edge going from an initial state to a final state. The edge is labeled with a sequence 
of microprotocols which makes the protocol quite rigid. Koning and Huget also do not 
consider which microprotocols can be composed. This job is presumably left to the de- 
signer. With our commitment-based approach, it is possible for the agent to determine 
which protocols can be composed by looking at the states in the protocol. 

5.4 Agent Communication Languages 

An agent communication language (ACL) allows agents of heterogeneous designs to 
interact. Developing a semantics for agent communication languages (ACLs) that is ex- 
pressive and verifiable has been a long standing goal of the agent community. To be 
verifiable, an ACL should have social semantics [23]. Earlier efforts at standardization 
of ACLs like Arcol [24] and KQML [25] promoted mental agency, that is, they were 
based on mental concepts like beliefs and intentions, and were therefore, not verifiable. 
The ongoing efforts at standardizing ACLs are based on communicative acts. The prob- 
lem with giving a communicative acts based semantics is that it is not clear what the 
meanings of the communicative acts should be. Also it is not clear how many commu- 
nicative acts are needed. Another challenge is relating the communicative acts to the 
conversation in which they occur. By giving semantics to protocols directly we are able 
to give a simple, operational characterization of protocols without getting bogged down 
with the above issues. 

Dignum and van Linder [26], and Guerin and Pitt [27] define ACLs in terms of
communicative acts. Dignum and van Linder’s framework considers four components
(the information component, the action component, the motivational component and
the social component) as constituting an agent framework and formally describes and
relates them. The framework is developed in dynamic logic, which is monotonic. They
postulate a COMMIT communicative act, but other communicative acts have mentalistic
preconditions, which makes the framework suitable only for a system of cooperative
agents.

Guerin and Pitt propose an ACL specification in which declarative ACL specifications
are given procedural interpretations. An ACL specification in their approach
consists of three parts: a Converse Function that specifies permissions and obligations
for subsequent speech acts based on the conversation state, a Protocol Semantics that
captures protocol-dependent meanings of speech acts, and a Speech-Act Semantics that
gives the protocol-independent part of the meaning. It is not clear how useful it is to
model the protocol-independent part of the meaning, since most meaning comes from
the protocol.

6 Future Directions 

Our main aim in this work is to come up with protocols that have verifiable semantics 
and constrain the agent no more than to the extent necessary to carry out legal interac- 
tions. To carry our work further, we have identified the following future directions. 
Fig. 4. FSM representation of the extended NetBill protocol



Compiling NCMs into FSMs: The preceding sections present commitment-based se- 
mantics for interaction protocols and show how an agent can reason with commitments. 
However, it is not necessary that agents be able to reason about commitments to exe- 
cute a protocol. For some applications efficient execution might be important. It may 
also be the case that an agent designer wants to exclude certain risky or lengthy paths 
(behaviors) in the protocol. For such agents, it would be useful if we could compile an 
NCM into an FSM. The complete NCM need not be compiled into an FSM. A designer 
could selectively compile behaviors (sequences of transitions) from an NCM into an 
FSM. The FSM can then be executed without an inference engine. Another advantage 
of compilation instead of directly designing or extending an FSM is that it is not al- 
ways clear how to add states and transitions to an FSM. Compilation from an NCM 
makes this process automatic. Figure 4 shows an example FSM for NetBill, which may
be compiled from the action description presented in Section 4.2.

Since C+ action descriptions have a transition system semantics, it should be rela- 
tively straightforward to compile NCMs into FSMs. An important future direction is to 
formally define NCMs in terms of C+ action descriptions and present a procedure for 
compiling NCMs into FSMs. It is equally important to prove that the generated FSM 
is sound and complete with respect to the NCM it was compiled from. We are also in- 
vestigating ways to automatically compile FSMs from NCMs. Selective compilation is 
also a subject of future research. 
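
One way to picture such a compilation is as an enumeration of the states reachable under a transition function, with each distinct state becoming a numbered FSM state. The Python sketch below illustrates this idea under strong simplifying assumptions: compile_fsm and toy_step are hypothetical names, the toy transition function is deterministic, it hard-codes a payment path in which sending the goods directly yields the customer's commitment to pay, and nothing is proved here about soundness or completeness.

```python
def compile_fsm(start, step, actions):
    """Explore reachable states; return (state numbering, transition table)."""
    index = {start: 0}
    table = {}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for action in actions:
            nxt = step(state, action)
            if nxt is None:          # action not allowed in this state
                continue
            if nxt not in index:
                index[nxt] = len(index)
                frontier.append(nxt)
            table[(index[state], action)] = index[nxt]
    return index, table


def toy_step(state, action):
    """Hypothetical, heavily reduced NetBill fragment used for illustration."""
    if action == "SendGoods" and "goods" not in state:
        return state | {"goods", "C(c,m,payc)"}
    if action == "SendPayment" and "C(c,m,payc)" in state:
        return (state - {"C(c,m,payc)"}) | {"pay", "C(m,c,receiptc)"}
    if action == "SendReceipt" and "C(m,c,receiptc)" in state:
        return (state - {"C(m,c,receiptc)"}) | {"receipt"}
    return None


ACTIONS = ["SendGoods", "SendPayment", "SendReceipt"]
index, table = compile_fsm(frozenset(), toy_step, ACTIONS)
for (src, action), dst in sorted(table.items()):
    print(f"q{src} --{action}--> q{dst}")
```

The resulting table mentions only state numbers and action labels, so it can be executed without any representation of commitments. Selective compilation could be approximated by passing a step function that returns None for the transitions a designer wishes to exclude.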

Protocol Specification: While our formulation of NCM has some desirable properties 
like declarative rules and elaboration tolerance, it lacks other properties desirable in 
protocols such as a comprehensible graphical representation, role bindings for agents 
and temporal model checking [28]. For example, we cannot prove satisfactorily if the 
NetBill NCM is correct and complete. A related piece of work is to compare NCMs
with other formalisms like statecharts and Petri Nets in more depth. Another interesting
avenue to explore is the compilation of NCMs into more expressive graphical formalisms
like statecharts.
Protocol Distribution: In the NCM representation of NetBill, we did not specify roles
in the protocol. However, an agent will be executing the role that it adopts in the
protocol. We want to formulate procedures based on symbol manipulation to distribute a
centralized protocol among the various roles in the protocol. The result will be role
skeletons that are also NCMs. It is not clear what the specification of a role itself should
be. Is a role just a label, or does it specify the required capability of an agent to adopt a 
role along with normative rules and authorizations that come along with the role? An- 
other technical challenge is proving that a distributed protocol is sound and complete 
with respect to the centralized protocol. 

Creating Role Skeletons from BPEL Flows: BPEL [29], the Business Process
Execution Language, is a draft standard for specifying and coordinating business processes.
Intuitively, it is a process flow graph with nodes as tasks and edges as messages. Busi- 
ness flows, as they are currently specified, are like FSMs in the sense that they have 
been designed with certain scenarios in mind. As such, they are quite rigid. To make 
business processes more flexible, we envisage the following design-time methodology. 

1. The designers start with an interaction diagram or state machine or some other 
graphical representation for the protocol that is to be modeled as a BPEL flow. 

2. The designers build an NCM representation of the protocol with enhancements to
make it more flexible and then partition the NCM into the role skeletons.

3. The role skeletons are compiled into FSMs.

4. The FSMs are compiled into a BPEL flow. This should be easier than compiling an NCM
into a BPEL flow.

We plan to build a tool which incorporates this methodology. The tool would suggest 
enhancements to the designer for a given protocol, build an NCM based on the choices 
of the designer and compile it into an FSM and then, perhaps with annotations from the 
user, compile it into a BPEL flow. 

Verifying Strategies: It will often be the case that an agent is confronted with the 
problem of selecting between multiple paths that it can take to reach a goal. The agent 
selects a path based on some strategy. For example, if the customer does not trust a 
merchant, the customer might adopt a strategy where it never pays before receiving the 
goods. On the other hand, if it does, it could adopt a strategy where it agrees to buy
goods for a certain price without even asking the merchant for offers or pays before 
getting the goods. It is possible to imagine more complex strategies. An interesting 
problem is the specification of strategies with respect to commitments and determining 
which protocols (more specifically, paths in the protocol) satisfy a given strategy. An 
agent can then inspect a role to check whether it satisfies its strategy. 
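
As a small illustration of what checking a strategy against paths might look like, the Python sketch below encodes the cautious customer's strategy as a predicate over action sequences and filters a set of candidate paths. The predicate, the helper satisfying_paths and the two example paths are hypothetical. A fuller treatment would state strategies in terms of commitments rather than action names.

```python
def never_pays_before_goods(path):
    """True if every SendPayment in path is preceded by a SendGoods."""
    goods_received = False
    for action in path:
        if action == "SendGoods":
            goods_received = True
        elif action == "SendPayment" and not goods_received:
            return False
    return True


def satisfying_paths(paths, strategy):
    """Keep only the paths that the given strategy accepts."""
    return [p for p in paths if strategy(p)]


# Two hypothetical paths through the protocol.
candidate_paths = [
    ["SendAccept", "SendGoods", "SendPayment", "SendReceipt"],
    ["SendAccept", "SendPayment", "SendGoods", "SendReceipt"],
]
for path in satisfying_paths(candidate_paths, never_pays_before_goods):
    print(" -> ".join(path))
```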

We conclude by saying that this work represents the first step in the development 
of a comprehensive methodology for designing flexible multiagent interaction proto- 
cols. Development of the protocol design tool that incorporates this methodology is the 
primary objective of this research. 
Abstract. This article examines some of the issues in representation
of, processing, and automated agent participation in natural language
dialogue, considering expansion from two-party dialogue to multi-party
dialogue. These issues include some regarding the roles agents play in
dialogue, interactive factors, and content management factors.



1 Introduction 

Most formal and computational studies of natural language dialogue have considered
only the two-party case, e.g., communication between two people, a person
and a dialogue system, or a pair of agents. In this article, we consider several
issues in dialogue management, and how the nature of the problem changes 
when considering multiple participants. For many of these issues, we refer to the 
dialogue models in the Mission Rehearsal Exercise (MRE) Project [1,2]. The 
MRE project [3] uses virtual humans to help train decision-making in a team 
context, by allowing a human trainee to rehearse simulated missions, interacting 
with the virtual humans using spoken and multi-modal communication in an 
embodied virtual world. Each virtual human maintains its own model of a plan, 
goals, beliefs, team tasks, dialogue state, negotiation state [4], and emotional 
state [5]. Virtual humans can understand and talk to the human trainee, as well 
as other virtual humans (using an agent communication language modelled on 
the physical performance of speech, indicating the verbal and non-verbal infor- 
mation expressed and the timing of actions). In the initial, Bosnia scenario, the 
trainee plays the role of an Army Lieutenant platoon leader, facing a dilemma in 
a peacekeeping situation. The Lieutenant must communicate with a Sergeant, 
a Medic, and others including platoon members and local citizens as well as 
more distant units by radio. Since the trainee has considerable flexibility in how 
he chooses to communicate, and the aim is to immerse the user in a realistic 
simulation, many issues in multi-party and multi-modal communication must 
be addressed. 

2 Participant Roles 

There are a number of different types of participant roles that are important 
for dialogue interaction. These include both local roles that shift during the 
conversation, such as speaker and hearer, roles tied to the activities that the
dialogue is a part of, and more permanent social roles that transcend particular 
dialogues. 

2.1 Conversational Roles 

At the most immediate level, there are the conversational roles. For two party 
dialogue, there are the basic roles of speaker and listener/addressee. When we 
consider multi-party communication, there are two related sub-issues: who can 
receive (is intended to receive) an utterance, and who is it addressed to. For 
instance, an agent A might want to ask a question of agent B, but might also 
want C to hear the question as well. Likewise, D might also hear the question 
even though A had no intention for D to do so. There are a number of other 
types of listener roles, depending on whether the listener is ratified by the 
speaker (intended to hear the communication) or not, and whether the listener 
is known by the speaker to be listening or not. Clark gives 
a taxonomy of some of these listener roles [6]. An additional consideration is 
whether the listener is in-context or out of context. An in-context listener (who 
has heard the previous relevant utterances) may interpret an utterance quite 
differently from one who comes in without this context (or worse, with a par- 
tial or different context). There are also roles that we can use to characterize 
agents with respect to a whole conversation, as well as a specific utterance. Ac- 
tive Participants may take up speaker and addressee roles in a conversation, and 
generally are engaged and attentive to the conversation. Overhearers (who may 
be ratified or not) are also part of the conversation, in that they will receive 
and interpret the constituent utterances, and utterances may be planned with 
them in mind (either to facilitate or block understanding), but do not play a 
main part in the conversation. Finally some agents may be un-involved in the 
conversation. 

2.2 Speaker Identification 

In two-party dialogue, speaker identification is not a real issue - any speech that 
does not come from oneself must come from the other participant. In multi- 
party situations, it may not be so trivial [7]. If just a single audio stream is 
present, one can use a number of features as evidence for identifying speakers. 
These include acoustic features of the voice itself, as well as stylistic features, and 
self-identifications (in the case where one can trust the speaker to provide accu- 
rate information). If multi-modal information is available, additional cues can be 
used. E.g., stereo microphone arrays can localize the position of the speech, and 
thus give clues as to the speaker’s identity. Likewise, visual information (e.g., 
of lips moving or other speech-related gestures), can help an agent identify the 
speaker. When multiple agents are involved in dialogue, it can also be important 
to provide cues to others as to who is speaking. For agent-agent communication, 
it is easy to put identifying information in the message channel itself. For hu- 
mans, however, it may be helpful to provide other cues, such as different voices, 
and visual cues such as lip movement and gestures for the speaking agent’s body 
or avatar. 

1. If utterance specifies addressee (e.g., a vocative, or utterance of just a name 
   when not expecting a short answer or clarification of type person) 
   then Addressee = specified addressee 
2. else if speaker of current utterance is the same as the speaker of the 
   immediately previous utterance 
   then Addressee = previous addressee 
3. else if previous speaker is different from current speaker 
   then Addressee = previous speaker 
4. else if unique other conversational participant 
   then Addressee = participant 
5. else Addressee unknown 

Fig. 1. MRE Agent Speech Addressee Identification Algorithm 

2.3 Addressee Recognition 

In the two party case, like speaker identification, addressee identification is triv- 
ial: whoever is not speaking is the intended recipient of an utterance. In the 
multi-party case, we must consider hearers and addressees separately, as dis- 
cussed above. Hearers of a spoken utterance can be computed by properties such 
as volume-level of speech, ambient noise, and distance and perceptual abilities 
of other agents. For agent messages delivered through a router, or other network 
channels, it may be possible to specify the exact set of receivers of the message. 

For calculating the addressee(s) of an utterance several types of informa- 
tion can be used. First, the speaker may directly indicate the addressee using 
a vocative expression (e.g., calling by name or role). One may also use infor- 
mation included in the content of an utterance, if, e.g., it would be clear that 
that content would only be addressed to a specific individual. Context is also an 
important clue - e.g., who had previously spoken or been addressed. If multi- 
modal information is available, this can also provide important clues: e.g., gaze or 
body orientation toward a particular individual. Likewise, attention-getting or deictic 
gestures are also clues. If one is the only observable hearer, that can also be a 
reason to assume the hearer is the addressee. The algorithm used for computing 
addressees in the MRE project is shown in Figure 1. 
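
The rules of Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MRE implementation: the `Utterance` record, the function name, and the reading that rule 4 applies when there is no previous utterance are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Utterance:
    speaker: str
    # Addressee named explicitly in the utterance (e.g. by a vocative), if any.
    specified_addressee: Optional[str] = None


def identify_addressee(
    current: Utterance,
    previous: Optional[Utterance],
    previous_addressee: Optional[str],
    participants: Sequence[str],
) -> Optional[str]:
    """Return the presumed addressee, or None when it is unknown."""
    # Rule 1: an explicitly specified addressee wins.
    if current.specified_addressee is not None:
        return current.specified_addressee
    if previous is not None:
        # Rule 2: the same speaker continues, so keep the previous addressee.
        if previous.speaker == current.speaker:
            return previous_addressee
        # Rule 3: a new speaker is assumed to address the previous speaker.
        return previous.speaker
    # Rule 4: exactly one other conversational participant.
    others = [p for p in participants if p != current.speaker]
    if len(others) == 1:
        return others[0]
    # Rule 5: addressee unknown.
    return None
```

Note that contextual cues such as gaze or gesture are not used here; the sketch covers only the rules listed in the figure.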

2.4 Other Participant Roles 

In addition to the conversational roles, there are also specific task roles, relating 
participants to tasks in a variety of ways. In two-party dialogue, typically agents 
are either performers of a task or those who desire the task to be done, although 
more complex relationships are possible. For multi-party team situations, such 
as those in MRE, more complex models are required to support negotiation and 
team action [4]. We distinguish the agent who will perform a primitive task, 
from the agent who is responsible for a complex task (this agent might perform 
all of the sub-actions, or might coordinate a team of actors). Also, some tasks 
have an authority, who can authorize the team-members to carry out the task. 
This might be different both from the responsible party, the performers of the 
primitive acts, and agents who actually desire the task to be performed. Agents 
might also be guards for a task, e.g., making sure that it is not performed. 

Some activities involving dialogue have specific roles, each with designated 
rights and responsibilities concerning participation in dialogue. This is true even 
for two party dialogue, such as shopkeeper and buyer, or information seeker and 
information provider; however, much more complex relationships are possible 
with multiple participants and roles. These rights can cover the ability to take 
turns, the length and content of turns, the right to assign turns, and the right to set and change the topic 
of the conversation. Courtroom dialogue is a striking case with many distinct 
roles, such as judge, clerk, prosecutor, defense counsel, and witness [8]. Roles 
may be filled by a single individual, or multiple individuals may fulfil the same 
role. Likewise, a single individual may play multiple roles. 

There are also social roles that go beyond a single activity, but structure 
multiple interactions and tasks. Two types of social roles include status roles 
(e.g., superior, subordinate, equal, incomparable), and closeness (e.g., friend, 
comrade, colleague, acquaintance, stranger, opponent, antagonist). These roles 
will influence the kinds of interaction allowed (e.g., only a superior may give 
an order to a subordinate), to how likely one will be to adopt the attitudes 
of another, or comply with their perceived desires. There are also institutional 
roles, such as office in a company, or military rank, defined by the institution. 

3 Interaction Management 

There are a number of aspects of managing the flow of communication, including 
the issues of who speaks when, what is the topic under discussion (and how it 
shifts), and what communicative channels are used (for which topics). Each 
of these is a research topic even for two-party conversation, but becomes more 
complex with multiple agents. 

3.1 Turn Management 

There has been a fair amount of work on turn-taking even for two-party dialogue. 
The basic questions are when to speak and when to stop speaking. Older dia- 
logue systems generally force rigid turn-taking, where one party must wait until 
the other finishes before speaking. Many more recent systems allow “barge-in”, 
where a human who already understands a system query may provide the answer 
before the system has finished the utterance. Other systems allow interruptions 
by both parties, to correct or initiate something new, as well as to respond to 
the current utterance. Speakers can give verbal and non-verbal signals of con- 
tinuation or imminent termination of the turn. Speakers use prosody, sentence 
structure, filled pauses (e.g., “uhhh”), as well as gaze and gesture. Turn-taking 
can be modelled using these cues as well as timing information to recognize 
turn-taking acts [9] such as take-turn, release-turn, and keep-turn. 

In multi-party dialogue turn-taking is more complex, since more agents are 
available to potentially take the turn. As well as simply more agents competing 
for the turn, more actions are possible, e.g., assigning the turn to a particular 
next speaker vs just releasing it to whoever wants to speak next. Likewise, one 
may need to request the turn in order to be able to take it, especially if one is 
not already an active participant. 
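
The additional multi-party actions mentioned above (assigning the turn, requesting it) can be made concrete with a small floor-tracking sketch. The class and method names are hypothetical, and the model is deliberately rigid: it ignores barge-in and interruption, and does not use prosodic or gaze cues.

```python
from typing import List, Optional


class Floor:
    """Tracks who holds the turn in one multi-party conversation."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None
        self.requests: List[str] = []  # agents waiting for the turn, in order

    def take(self, agent: str) -> bool:
        # take-turn succeeds only when the floor is free.
        if self.holder is None:
            self.holder = agent
            if agent in self.requests:
                self.requests.remove(agent)
            return True
        return False

    def request(self, agent: str) -> None:
        # Request the turn instead of interrupting the current holder.
        if agent != self.holder and agent not in self.requests:
            self.requests.append(agent)

    def release(self, agent: str) -> None:
        # release-turn: the floor is open to whoever wants to speak next.
        if self.holder == agent:
            self.holder = None

    def assign(self, agent: str, next_speaker: str) -> None:
        # The current holder hands the turn to a particular next speaker.
        if self.holder == agent:
            self.holder = next_speaker
            if next_speaker in self.requests:
                self.requests.remove(next_speaker)
```

In a two-party setting `assign` and `release` coincide, since there is only one possible next speaker; the distinction only matters with three or more agents.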

3.2 Channel Management 

In uni-modal communication systems, such as simple telephone speech systems, 
channel management is very similar to turn-management, though differences 
may arise if the communication channel enforces a single communicator at a 
time (as with half-duplex circuits, or chat systems which allow only one person 
to type at a time). In multi-channel systems, however, there is an additional 
issue of which channel to use for which content, as well as the timing of the con- 
tributions. Channels can use the same modality (e.g., a radio with different 
frequencies, or a chat system with different chat rooms or different communi- 
cation commands), or different modalities. E.g., in the MRE system, agents can 
use verbal communication for face to face or radio communications, and can also 
use gaze and gesture in the visual mode for face to face communications. One 
could thus use the speech channel as the main communicative mode, while using 
the visual mode for backchannels, indicating attention and understanding. 

For multi-party dialogue, one can simultaneously have multiple “main-chan- 
nels”, e.g., one per topic, one per conversation, or one per set of participants. 
Thus, one may have simultaneous communication that is not interruption, be- 
cause it occurs on different channels between different participants. 

3.3 Thread/Conversation Management 

Turn and channel management concern when and where communication takes 
place. Thread management concerns what is being communicated, specifically 
which topics are discussed when, and how to organize the progression of topics. 
Traditional models follow a stack-based topic organization [10], in which one can 
have hierarchical organization of topics, but not parallel topics under discussion 
at the same time - when one goes back to a previous topic, one should “pop” 
the current topic from the stack. Even for two-party conversation, this may be 
too restrictive [11], especially when multiple channels can be used (e.g., many 
chat systems, in which two people can type simultaneously without seeing the 
text until one hits return, and topics often proceed in pairs). With multiple 
participants, it is also much easier to keep multiple topics open, with different 
sets of participants. 
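
The restriction imposed by stack-based topic organization can be seen in a minimal sketch. The class below is a hypothetical illustration, not the model of [10]: only the top of the stack counts as under discussion, and returning to an earlier topic closes everything opened after it.

```python
from typing import List, Optional


class TopicStack:
    """Stack-based topic organization: only the top topic is under discussion."""

    def __init__(self) -> None:
        self._topics: List[str] = []

    def current(self) -> Optional[str]:
        return self._topics[-1] if self._topics else None

    def push(self, topic: str) -> None:
        # Open a sub-topic of whatever is currently being discussed.
        self._topics.append(topic)

    def return_to(self, topic: str) -> List[str]:
        """Go back to an open topic, popping (closing) everything above it."""
        if topic not in self._topics:
            raise ValueError(f"topic not open: {topic}")
        closed: List[str] = []
        while self._topics[-1] != topic:
            closed.append(self._topics.pop())
        return closed
```

Because `return_to` discards the intervening topics, two topics can never be open in parallel, which is exactly the limitation noted above for chat-style and multi-party conversation.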

Another issue is that of multiple conversations. Most current dialogue systems 
are concerned with only a single conversation with a single user. In contrast, 
many tasks require different periods of communication separated by periods of 
task performance or maintenance in which no communication is required. While 
some of the information that is conveyed during a prior communication episode 
is maintained by the participants, often the specific dialogue structure such as 
the turn and topic structure is not preserved. While it may be best to model 
separate conversations even for extended two-party dialogue, it is essential for 
multi-party dialogue, where multiple groups of participants communicate with 
different groups, using different media, about different topics. Having multiple 
conversation models allows each one to have its own structure, which can be 
simple and independent of the structure of other conversations that might be 
going on at the same time. For example, in the MRE Bosnia domain, there is 
usually a main conversation between Lieutenant, Sergeant and sometimes medic, 
and subordinate conversations between the Lieutenant and other units over the 
radio, and between the sergeant and troop members on specific tasks. Each 
conversation has its own starting, body, and ending phases, as well as participant 
roles. In some circumstances, especially when multiple participants are part of a 
conversation, participants can dynamically enter and leave a conversation while 
it is ongoing. In more complex situations, such as cocktail party conversation, 
conversations can also split and merge dynamically. 

Sometimes multiple conversations are not completely independent. This oc- 
curs especially when they share a participant, so that different conversations 
must compete for attention of the participant. Sometimes topics are linked as 
well. One conversation might be dependent on another, e.g., if agent A asks 
agent B a question in conversation m, and then B must query agent C in con- 
versation n in order to reply to A. In this case conversation m is dependent 
on conversation n, at least for that content. Sometimes conversations are not 
dependent on, but influenced by, another, e.g., when participants overhear another 
conversation and take up the same topic (or comment on the other conversation 
in some way). 

When multiple threads are going on at the same time, it can be tricky to 
determine which thread a particular utterance belongs to. For the two-party, sin- 
gle conversation case, one can usually rely on topical coherence and cue phrases 
to determine whether the current utterance continues an existing thread, ends a 
thread, or begins a new one (and at which level of structure). With multiple par- 
ticipants and multiple conversations which may share participants, the problem 
becomes more difficult. One can use a number of relationships to try to match 
the utterance to the proper conversation. There may be a connection between a 
conversation and a channel; in that case, observing the utterance on that channel 
may help determine the conversation. Likewise, there is a relationship between 
the addressee and the conversation. As in Figure 1, where knowledge of the con- 
versation was used to help predict the addressee, knowledge of the addressee 
can point to a conversation containing that addressee as a participant. There is 
also a relationship between topics and conversations. Identifying the topic of an 
utterance may help determine which conversation it belongs to, and vice versa. 
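
One simple way to combine these relationships is to count how many cues each candidate conversation matches. The sketch below is an assumption-laden illustration rather than the MRE method: the `Conversation` record, the equal weighting of cues, and the requirement that the speaker already be a participant are all choices made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Conversation:
    name: str
    participants: Set[str]
    channel: Optional[str] = None
    topics: Set[str] = field(default_factory=set)


def candidate_conversations(
    conversations: List[Conversation],
    speaker: str,
    channel: Optional[str] = None,
    addressee: Optional[str] = None,
    topic: Optional[str] = None,
) -> List[Conversation]:
    """Rank conversations by how many cues they match (best first)."""
    scored = []
    for conv in conversations:
        if speaker not in conv.participants:
            continue  # assumption: the speaker must already be a participant
        score = 0
        if channel is not None and conv.channel == channel:
            score += 1
        if addressee is not None and addressee in conv.participants:
            score += 1
        if topic is not None and topic in conv.topics:
            score += 1
        scored.append((score, conv))
    # list.sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [conv for _, conv in scored]
```

A tie at the top of the ranking signals that the available cues do not disambiguate the thread, in which case further context would be needed.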

3.4 Initiative Management 

Initiative (or control) [12-16] concerns which agent is currently setting the 
agenda for topics of discussion. If one agent has the initiative, then another 
agent does take turns, but only to react to what was said, not to start new 
topics. Two-party dialogue systems are traditionally either user-initiative (such 
as question answering systems, where a user may pose a query, and the system 
consults a database and provides an answer) or system-initiative, in which the 
system asks a series of queries to specify the parameters for a service request. 
More recently, mixed-initiative systems allow user and system to both take the 
initiative at different points. E.g., the system can take the initiative when there are 
problems in communication, to direct toward possible solutions, and the human can 
take control to more efficiently provide known information. 

In multi-party dialogues, initiative is less symmetric than in two-party dialogues 
for equivalent tasks [17]. Thus, the more participants in a conversation, the 
less likely it will be that each participant has an equal amount of initiative. 
Team leaders who structure the interaction tend to emerge, either formally or 
informally. Other kinds of initiative are also possible, e.g., cross-initiative, where 
a responder does not take initiative herself, but redirects it to a third party (who 
might not even have been active), or in which a third party interjects. There are 
also issues of cross-conversation initiative, e.g. in the case of one conversation 
being dependent on another, the initiative-holder of one conversation is really 
taking direction from someone else in another conversation. 

3.5 Attention Management 

Most single-user, single-system dialogue systems simply assume that attention 
is always present. Even when attention is explicitly modelled, it is usually 
a binary decision of either being on the conversation and other participant, or 
elsewhere. In multi-party, multi-conversation situations, however, a much more 
detailed model of attention is required. An attention model can be used to sum- 
mon others into a new or existing conversation, and can model which conversa- 
tion each participant is attending to. 

4 Grounding and Obligations 

Much of the local content of dialogue can be modelled using notions like obli- 
gations and grounding [9,18-24]. These models become more complex when 
considering the multiparty case. 

Grounding is the process of adding to the common ground between partic- 
ipants in conversation [24]. The grounding model in [9,19,25] consisted of a 
structure of Common Ground Units (CGUs), each of which contains material 
that is added to the common ground together. Each CGU has a unique initia- 
tor, responder, contents and state. The state is calculated using a finite state 
automaton, updated by grounding acts performed on the CGU. States include 
those in which the contents are grounded and ungroundable, as well as interme- 
diate states in which an acknowledgement or repair is needed from one party or 
another. By recognizing grounding units and the CGUs that they construct and 
add to, a computational agent is able to model and participate in the grounding 
process. 
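
To make the idea of a CGU whose state is driven by a finite state automaton concrete, here is a deliberately reduced sketch. It is not the transition table of [9,19,25]: the state names, the act names, and the transitions themselves are simplified assumptions chosen only to show the mechanism.

```python
from dataclasses import dataclass

# (state, role, act) -> next state. A reduced, hypothetical table for illustration.
TRANSITIONS = {
    ("start", "initiator", "initiate"): "needs_ack",
    ("needs_ack", "initiator", "continue"): "needs_ack",
    ("needs_ack", "responder", "acknowledge"): "grounded",
    ("needs_ack", "responder", "request_repair"): "needs_repair",
    ("needs_repair", "initiator", "repair"): "needs_ack",
    ("needs_ack", "initiator", "cancel"): "ungroundable",
    ("needs_repair", "initiator", "cancel"): "ungroundable",
}


@dataclass
class CGU:
    initiator: str
    responder: str
    contents: str = ""
    state: str = "start"

    def update(self, agent: str, act: str) -> str:
        """Apply one grounding act; unknown combinations leave the state unchanged."""
        if agent == self.initiator:
            role = "initiator"
        elif agent == self.responder:
            role = "responder"
        else:
            return self.state  # acts by other agents are ignored in this sketch
        self.state = TRANSITIONS.get((self.state, role, act), self.state)
        return self.state
```

The single `initiator` and `responder` fields reflect the assumption that each CGU has exactly one of each; relaxing that assumption is the multi-party problem discussed next.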

In the MRE project, this model has been used in multiparty conversation, 
but only in cases in which there is a single initiator and responder of a particular 
CGU. For the more general case, in which there are multiple addressees, it is less 
clear what the proper grounding model should be. One option is to allow any of 
the addressees to acknowledge for the contents to be considered grounded. The 
problem is that this may lead to overly optimistic [26] estimations of common 
ground, where some agents did not in fact understand or possibly receive the 
communications. The pessimistic extreme is to require evidence of understanding 
from each addressee. While this is safer, it seems somewhat unrealistic when 
many of the addressees are human. Some sort of middle-ground is also possible, 
requiring an amount of evidence that is more than a single acknowledgement 
from one agent, but less than a separate acknowledgement from each agent. 
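
The optimistic, pessimistic, and middle-ground options can be expressed as one parameterized test. The function and its `required_fraction` parameter are hypothetical; counting acknowledgements is only one possible measure of the "amount of evidence".

```python
import math
from typing import Set


def is_grounded(
    addressees: Set[str], acknowledged: Set[str], required_fraction: float
) -> bool:
    """True when enough addressees have given evidence of understanding.

    required_fraction = 0.0: one acknowledgement suffices (optimistic).
    required_fraction = 1.0: every addressee must acknowledge (pessimistic).
    Values in between give a middle-ground policy.
    """
    if not addressees:
        return False
    # Only acknowledgements from actual addressees count.
    acks = len(addressees & acknowledged)
    # At least one acknowledgement is always required.
    needed = max(1, math.ceil(required_fraction * len(addressees)))
    return acks >= needed
```

For four addressees, `required_fraction=0.5` would require two acknowledgements before the contents are treated as common ground.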

Another interesting issue is grounding across conversations. E.g., if A asks 
B a question and observes B asking the same question to C (whether in the 
same conversation or a different one), A has evidence that B has understood the 
question, even though B has not yet responded to A. 

Multiple addressees also present a challenge for models of obligation. The 
model of discourse obligations presented in [19-21] takes one of the main effects 
of utterances like requests and questions to be an obligation to perform some 
action such as addressing the request (by performing the requested action, ac- 
cepting or rejecting the request, or other negotiating or explaining move). When 
there are multiple addressees, however, it is not so clear what the status of these 
obligations is. Does every addressee have a personal obligation? Is there an 
indefinite obligation assigned to the group, that can be satisfied by any member 
performing an obligation-relieving action? In the case of this indefinite obliga- 
tion, what is it that motivates any particular agent to act? 

Also there is the issue of transfer of obligation. To take the example given 
above, where B redirects A’s question to C, if this is done in the presence of A, 
does B still have the obligation? Whether or not B still holds the obligation, does 
C’s response in A’s presence relieve B of this obligation? Can another party, say 
D, relieve the obligation by providing an answer even when not addressed? The 
answers to some of these questions depend on the particular type of activity. For 
instance, if the purpose of A’s question is to solicit information, and C or D are 
trustworthy, probably no more action is required of B. On the other hand, if it 
is a classroom situation, where A is asking the question not so much to find out 
the answer, but to determine whether B knows it, then B’s redirect to C and 
D’s spontaneous reply would be out of place, and perhaps subject to sanctions. 

In some cases, multi-party dialogue can actually make the theoretical models 
of dialogue clearer rather than obscuring them. A case in point is an account of 
what motivates agents to answer questions. As described above, one model that 
has been used in some dialogue systems takes obligations as the motivation; 
the systems are designed to track obligations and then use these to motivate 
performing answers. An alternate model has been to use dialogue structural 
considerations, such as Questions Under Discussion (QUD), based on work by 
Ginzburg [27] to model question answering. When a question is asked, it gets 
added to the QUD, which in turn licenses answers to the question (including 
elliptical short answers). Both approaches were used in the TRINDI project [28, 
29]. The GoDiS system [30] uses a QUD structure, while the EDIS system [31] 
uses the obligation approach. For simple two-party information-seeking domains 
such as Autoroute [29], there is little to choose between these two accounts. Both 
do an adequate job of representing questions, answers, intermediate states, and 
observation of lack of answers or other responses. 

However, the two structures really serve distinct functions, as pointed 
out in [32]. QUD represents information about what would count as an answer, 
while obligations represent who should/must answer. Both reflect on the ques- 
tion of when the answer should occur. Obligations may specify time-limits on 
the answer. QUD, on the other hand will allow one to track when a particular 
utterance could be understood as an answer to that question. E.g., if an in- 
tervening question of a similar type is asked after the original question, a new 
utterance may be taken as an answer to the second rather than first question. In 
the MRE dialogue model, we represent both QUD and obligations. The former 
is part of the conversation structure of a specific conversation, while the latter 
(if grounded) is a property of the social state between agents. Thus an obligation 
might be introduced by a question in one conversation, and relieved in another 
conversation. The form of the answer depends on the QUD structure, however. 
If a question is not on QUD in the current conversation, then the question must 
be reintroduced before answering, or at least the answer must be given with 
sufficient clarity to accommodate the question [33]. 
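
The separation between per-conversation QUD and cross-conversation obligations can be sketched as two data structures. All names below are hypothetical, and the sketch skips the grounding step (it treats every question as already grounded); it is not the MRE dialogue model itself.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Conversation:
    name: str
    qud: List[str] = field(default_factory=list)  # most recent question last


@dataclass
class SocialState:
    # (obliged agent, question) pairs, shared across all conversations.
    obligations: Set[Tuple[str, str]] = field(default_factory=set)


def ask(conv: Conversation, social: SocialState, addressee: str, question: str) -> None:
    conv.qud.append(question)                       # licenses answers in this conversation
    social.obligations.add((addressee, question))   # records who should answer


def answer(conv: Conversation, social: SocialState, answerer: str, question: str) -> str:
    """Relieve the obligation and report which form of answer is licensed here."""
    social.obligations.discard((answerer, question))
    if question in conv.qud:
        conv.qud.remove(question)
        return "short answer"
    # Not on this conversation's QUD: the question must be reintroduced,
    # or the answer made explicit enough to accommodate it.
    return "reintroduce question"
```

An obligation created by `ask` in one conversation can thus be discharged by `answer` in another, while the form of the answer still depends on the local QUD.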

5 Conclusions 

In this article we have examined a number of issues in dialogue management for 
how they scale when moving from a two-participant model to a multi-participant 
model. Two obvious choices are available for multi-party models. One is to treat 
multiparty conversation as a set of pairs of two-party conversations. While this 
has the advantage of simplicity and using existing models, it is less than satis- 
factory in some cases. In the worst case, one will still need to move beyond the 
two party case in order to arbitrate between the multiple interactions, e.g. A 
with B and A with C. In some cases this will be more complex than the second 
choice: changing the model to allow multiple participants. We can then see 
two-party dialogue as a special simple case of multiparty dialogue. 

Dialogue system evaluation is also a difficult subject even for two-party dia- 
logue. There are no universally agreed on metrics, due in large part to the very 
different types of tasks that dialogue systems are used for. Still, there are some 
general themes for evaluation, including task success, naturalness of interaction, 
user satisfaction, and efficiency. Some of these can be applied to the multi-party 
case, but the metrics become more difficult to calculate. E.g., for efficiency does 
one count real-time, or total agent time? One might count only a human’s time, 
but what if there are multiple humans? Similar issues exist for other metrics - how 
does one measure naturalness when some agents communicate fairly naturally but 
others don’t? 

We are as yet only in the beginning stages of modelling multi-party dialogue, 
with few applications and very few implemented systems. The requirements will 
surely increase, however, as more societies of agents and people interact in more 
fluid ways. 

Abstract. In many situations conversations involve more than two par- 
ties. However, most research on communication modelling in e.g. multi- 
agent systems limits itself to conversations between two parties at a 
time. Very little research has been done yet on modelling multi-party di- 
alogues. In this paper we first explore the differences between two party 
and multi-party dialogues and we indicate a number of issues that arise 
when considering dialogues between more than two parties. Then we 
take some steps towards creating a testbed in which these issues can be 
explored and theory on multi-party dialogues can be developed. 



1 Introduction 

In the past few years a considerable amount of research has been done in the area of agent 
communication (see e.g. [4, 3]). In most of the reported work the dialogues only 
involve two parties at a time. Even though specifications like the FIPA ACL 
[6] permit a message to be sent to more than one addressee, the protocols they 
give are mainly based on two party dialogues. If more parties are involved this 
is mostly through a broadcast message from one party to all other parties, after 
which each of these parties replies to the original sender (cf. for instance the Con- 
tract Net protocol). A good example of this type of dialogue, where more than 
two agents are present but conversation is always bilateral, is [18], in which dia- 
logues for purchase negotiations are modelled. Actually, we do not consider these 
to be multi-party dialogues but rather a number of parallel two-party dialogues. 

The fact that most prior work in argumentation has focused on 2-party di- 
alogues can be explained by the fact that philosophy of argumentation has, for 
2300 years, focused mostly on persuasion dialogues, perhaps because philoso- 
phers were seeking after truth. Further, the multi-agent negotiation literature 
has focused on 2-party commercial transactions, perhaps due to the pernicious 
effects of economic theory (which studies only simplified models of reality, not 
realistic models) on our field. 

The fact that agent communication also focussed on two-party dialogues, in 
our opinion, stems from the fact that much of the work done on agent conversa- 
tions, such as [17, 19] draws from the work on dialogue theory in linguistics (e.g. 
[27] is quite influential in this area). The theory gives a typology and theory for 
the moves that can be made in the different types of dialogues. However, this 
theory is built for two-party dialogues and there is little or no (usable) research 
done on multi-party dialogues. Neither in the field of pure linguistics nor in the 
field of distributed AI did we find many leads to use for constructing theories 
of multi-party dialogues for agents. The only work we have been able to find 
treating multi-party dialogues is from Traum [5] which mainly deals with focus 
of attention and initiative in multi-party conversations. 

In practice the need for multi-party dialogues becomes more and more ap- 
parent. For instance, several agents have to cooperate in order to find a solution 
for a problem. Each of the agents might have a part of the solution, but only 
their interaction might reveal how to combine all the pieces of the puzzle. This 
might happen when a user agent tries to compose a holiday to a certain town 
and needs a flight and a hotel and maybe some local entertainment. If each of 
these components is delivered by a different source maintained by a different 
agent, it might be beneficial if hotel and flight manager can negotiate dates on 
which both flights and hotel are available. The user agent could listen in and 
intervene whenever the dates drift too far from the preferred dates or the 
prices get too high. This would be more efficient than the user agent trying to 
first reserve a flight and afterwards a hotel (or the other way around) and having 
to go back and forth whenever the hotel is not available on the days between 
the flights. 

In this paper we will explore the field of multi-party dialogues. First, in 
section 2, we discuss a number of the issues that arise when changing from a 
two-party situation to a multi-party situation. That section therefore does not list 
all dialogue issues, because many issues exist for both two-party and 
multi-party dialogues (e.g. whether parties are cooperative or not, or are sincere 
or not). After this exploration of issues, in sections 3 and 4 we will present a first 
implementation of a kind of testbed in which a number of these issues can be examined. 
We only show the simplest case of an inquiry dialogue. The value 
of the example shown in section 5 is mainly as a first step of a systematic way 
of exploring the influence that design choices have on the resulting dialogues. 
Whereas the proof of interesting properties of two-party dialogues is already 
quite difficult (see [17]) it will be even harder in multi-party dialogues. The 
implementation presented in this paper can be seen as the start of a testbed in 
which such properties can be empirically explored and theory can be developed 
towards making the right choices for dialogues depending on the environment of 
the system. 

2 Issues in Multi-party Dialogues 

Probably one of the few things that does not change when moving from two to 
multiple parties is the typology of the dialogues. The well-known typology of 
Walton and Krabbe [27] is based on the goal of the dialogue (synchronization of 
beliefs and/or actions) and the starting situation (conflicting beliefs and/or 
intentions). Both elements play an equal role in multi-party dialogues. 

2.1 Open vs. Closed Systems 

The first difference that comes up right away is whether the system is open 
or closed, i.e., are all parties present during the whole dialogue or can they join 
later and/or leave before the end of the dialogue? When we have only two parties 
there is no choice: both parties are needed to keep the dialogue going. However, 
when more parties are involved this is no longer necessary. In news groups it 
probably is more common for parties to join and leave during a dialogue than 
for parties to stick around during the complete dialogue. Also in situations where 
the parties consult experts during the dialogue to explain issues or arbitrate on 
conflicts, the expert is only part of a small slice of the dialogue, although in some 
dialogues, e.g., legal proceedings, experts and expert witnesses may be excluded 
by the rules of the interaction from participating in the entire dialogue. There 
are presumably good reasons for this in legal theory, e.g., so that the expert’s 
testimony is not biased by what has occurred before in the courtroom. 

This brings up the first consequence. One has to distinguish between situa- 
tions in which the participation in the dialogue is formally arranged and situ- 
ations where participation is left up to the individual agents. In the first case 
the entrance and exit of a participant in the dialogue is arranged and can only 
take place at certain points during the dialogue. When the participation is 
left completely to individual participants, the question becomes how we know that 
a party has left the dialogue or joined it. Are there special speech acts to 
denote these acts? Do participants register? Often entrance and exit are marked by 
special messages (e.g. a register or “hello” message and a de-register or “bye” 
message; for speech acts see [21, 22]). However, if asynchronous communication 
is used it might happen that a question is directed to an agent that has just exited 
the dialogue. A mechanism should be devised both to detect this situation 
(preventing agents from waiting indefinitely for an answer) and to recover from it. 
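
One possible shape of such a mechanism is sketched below in Ruby. It is not taken from the implementation described later in this paper: the class name Participation, the methods hello, bye, ask and stalled, and the fixed timeout are all hypothetical. The sketch registers entrance and exit messages, refuses questions to parties that are known to have left, and reports pending questions whose addressee has departed or whose deadline has passed.

```ruby
# Hypothetical bookkeeping for an open dialogue (all names are illustrative).
class Participation
  Pending = Struct.new(:from, :to, :question, :deadline)

  def initialize(timeout)
    @timeout = timeout   # how long an asker is willing to wait
    @present = {}        # agent name => true while the agent is in the dialogue
    @pending = []        # questions still waiting for an answer
  end

  # Entrance message ("hello").
  def hello(agent)
    @present[agent] = true
  end

  # Exit message ("bye").
  def bye(agent)
    @present.delete(agent)
  end

  def present?(agent)
    @present.key?(agent)
  end

  # Returns :sent, or :undeliverable if the addressee is known to be gone.
  def ask(from, to, question, now)
    return :undeliverable unless present?(to)
    @pending << Pending.new(from, to, question, now + @timeout)
    :sent
  end

  # Removes a pending question once its addressee has answered it.
  def answered(to, question)
    @pending.reject! { |q| q.to == to && q.question == question }
  end

  # Detection step: questions whose addressee left or whose deadline passed.
  def stalled(now)
    @pending.select { |q| !present?(q.to) || now > q.deadline }
  end
end
```

Recovery is deliberately left open in this sketch; an asker could, for example, redirect a stalled question to the parties that are still present.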

Of course open settings also make it difficult to check whether, for instance, 
the goal of a dialogue has been reached. Is there general agreement on a course of 
action, is there mutual believe, is everyone convinced? For each of these questions 
one should specify which parties (still) count. 

2.2 Roles 

A further issue is the role of each of the parties in the dialogue. This can be 
viewed from different perspectives. In the first perspective we look at roles from 
a linguistic point of view. In a two-party dialogue there is always a speaker and 
an addressee (or hearer). However, in a multi-party dialogue we can at least 
distinguish: speaker, addressee, auditor, overhearer and eavesdropper (see e.g. 
[14]). Although the speaker directs a message to one (or more) other parties there 
can also be parties that hear the message without being addressed explicitly. The 
auditor is a party that is supposed to hear the message (this is intended by the 
speaker). An example is when I cc a mail to my boss to “prove” that I indeed 
made an inquiry as I had promised my boss before. The overhearer is allowed 
to hear the message but the speaker does not intend him to hear it. This happens 
when I send a message as a response to a question in a news group. All people 
subscribed to that news group can see the response but are not necessarily the 
intended audience. Finally, the eavesdropper is a party that happens to hear the 
message without the speaker wanting him to hear it at all, e.g. when I do a 
“reply-all” while I intended a “reply” to a mail. When determining the effect of 
a message on the other parties one has to take the role of that party into account. 
A request to perform an action might lead to an intention in the addressee while 
leading to a dropping of the same intention in the overhearer. 
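
The five linguistic roles can be made concrete with a small Ruby sketch. The record Utterance and its fields (speaker, addressees, cc, permitted) are assumptions made for illustration only; they encode the e-mail and news group examples above and are not part of any agent communication language.

```ruby
# Hypothetical message record: who speaks, who is addressed, who is
# deliberately copied in (cc), and who else is allowed to hear the message.
Utterance = Struct.new(:speaker, :addressees, :cc, :permitted)

# Classifies one party with respect to one message. `heard` states whether
# the party actually received the message.
def linguistic_role(utterance, party, heard)
  return :speaker   if party == utterance.speaker
  return :addressee if utterance.addressees.include?(party)
  return :auditor   if utterance.cc.include?(party)
  return nil unless heard                  # never heard it: no role at all
  utterance.permitted.include?(party) ? :overhearer : :eavesdropper
end
```

With such a classification in place, the effect of a message on a party's beliefs or intentions can be made dependent on the role that party has for that particular message.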

One can also look at the roles from a dialectic perspective. In a typical two- 
party persuasive dialogue there is a proponent and an opponent. In two-party 
dialogues of the inquiry or deliberation-type (in the Walton and Krabbe typol- 
ogy), the pro/con distinction already blurs. Additionally, in a multi-party dia- 
logue we can also have roles such as: neutral party, interested party, interviewer, 
advocate, respondent, examiner, challenged party, mediator, or arbitrator (to 
coin a few disciplinary-neutral terms). The role an agent plays influences the 
type of responses it can give during the dialogue. E.g. a mediator might give an 
alternative proposal or might ask for additional information from other parties 
after the first arguments have been exchanged. 

A third type of roles that can be distinguished are the social roles within 
the dialogue. A good example is that of a chairperson. These types of roles 
can influence the turn taking within the dialogue, but can also determine when 
parties can join the dialogue and leave it again. Finally, the chairperson might 
have the power to terminate a dialogue unilaterally or through some predetermined 
protocol. 

A last perspective on roles that we mention here is that of interests. Some parties in 
a dialogue will have the goal to terminate the dialogue successfully, while other 
parties might just want to disrupt the dialogue or try to prolong it indefinitely. This 
might happen in multi-party negotiations. Some parties might want to conclude 
the negotiations, trying to get the optimum result for themselves, while other 
parties benefit most when no agreement is reached at all. 

For each of the perspectives on roles one can choose whether roles are fixed 
once or can change during the dialogue. For linguistic roles, change seems the most 
likely choice, but other roles might also change (either explicitly or implicitly) 
during a dialogue. 

2.3 Medium and Addressing 

This issue ties in with that of linguistic roles. The main question is how mes- 
sages are addressed. We can make a distinction between one-to-one distribution, 
one-to-many distribution and one-to-all distribution. In a two-party dialogue all 
messages are directed at the “other” party. However, in a multi-party dialogue 
one can choose whether to address a message to a specific other party or to sev- 
eral (specified) other parties or just broadcast the message to all other parties. 
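
The three distribution modes can be illustrated with a few lines of Ruby. The function name recipients and the convention for its `to` argument (the symbol :all, an array of names, or a single name) are hypothetical choices made for this sketch.

```ruby
# Returns the parties a message is delivered to.
# `to` is :all (one-to-all), an Array of names (one-to-many),
# or a single name (one-to-one). The sender never receives its own message.
def recipients(sender, to, participants)
  others = participants - [sender]
  case to
  when :all  then others
  when Array then others & to
  else            others & [to]
  end
end
```

Intersecting with the current participants means that a message addressed to a party that is not (or no longer) present is simply delivered to nobody, which ties in with the open-system issue of section 2.1.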

Considering especially open dialogues it is also interesting to know whether 
the messages are observable throughout the dialogue or only when they are sent. 
In the first case, latecomers can check the messages that have been exchanged so far, 
while in the second case they just miss the start of the dialogue. 

Independently one can decide whether all communication is “overheard” by 
all parties or not. That is, can all parties hear all communication or only the messages 
that are directed to them? One might even make a distinction between observing 
that a message is exchanged and observing the content of the message. For example, one 
might see two persons whispering to each other without hearing the content. One 
might argue that messages that are only heard by a subgroup are a separate 
dialogue between that subgroup. In that case all messages should at least be 
heard by all participants of the dialogue. 

2.4 Coordination 

The first question on the issue of coordination is whether the parties can all 
react asynchronously or whether each time only one party can have its turn. 
Although the asynchronous coordination may seem chaotic, it is the standard 
for most news groups and mailing lists. 

The issue of turn-taking is relatively trivial in two-party dialogues. However, how 
is the order determined in which the parties take turns in a multi-party dialogue? 
Is it a round-robin protocol (the generalization of the strict turn-taking for two 
parties) or do we use other mechanisms? As said above, one might also have a 
chairman that explicitly determines the next party that can or should take its 
turn. 
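
The two turn-taking regimes mentioned here can be sketched in Ruby as follows. Both function names are hypothetical, and the chairperson is reduced to a block that picks the next speaker from the parties and the history of turns so far.

```ruby
# Round-robin: strict turn-taking generalized to n parties.
# Returns the order in which the parties act.
def round_robin(parties, rounds)
  parties * rounds
end

# Chaired: the block plays the chairperson and decides who speaks next,
# given the parties and the turns taken so far.
def chaired(parties, turns)
  history = []
  turns.times { history << yield(parties, history) }
  history
end
```

For three parties and seven rounds, round_robin yields 21 turns, which is the setting used for the experiment in section 4.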

Independently one should decide to which messages each party can react. Can 
parties react to all messages delivered before, to the last messages of all parties 
delivered before, only to the originator, etc.? Can they react to a message of 
which they were not the addressee? 

Of course, one should also consider how all parties can react. A common 
rule in many two-party argumentation systems is that parties are not allowed 
to repeat an argument in order to avoid infinite regression of arguments (see 
e.g. [24]). In multi-party arguments the question arises whether other parties are 
allowed to repeat an argument. One could argue that this indicates additional 
support for an argument and it would be useful to strengthen an argument. 

2.5 Termination 

Although we argued above that the goals of the multi-party dialogues are sim- 
ilar to the two-party dialogues they have to be translated to their multi-party 
versions. For instance, does a persuasion dialogue end successfully if all parties 
are convinced of an argument, or if most are convinced of it, or if some desig- 
nated parties are convinced? One might say that the original addressees should 
be convinced while the auditors and overhearers do not have to be convinced. 
But many more choices are possible. 

In the same vein the termination of the dialogue becomes less obvious. Who 
determines whether a dialogue is terminated? In case there is a chairman he 
might decide. However, it might also be that a majority of the parties should 
agree that the dialogue is terminated. Or the party that started the dialogue can 
explicitly end it. 

Each of these choices influences not only the point of termination but also the 
strategy and maybe the rules of the dialogue game. In case a majority of the 
participants has to agree to terminate the dialogue, a party can try to form 
a coalition with enough other parties to “win” the dialogue. In case only the 
originator has to be convinced all arguments will be directed to this party. 
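
The success criteria discussed in this subsection can be written down as interchangeable predicates. The Ruby sketch below is illustrative only: the function name, the criterion symbols and the representation of the parties' state as a hash from names to true or false are assumptions.

```ruby
# `convinced` maps each party to true or false.
# `designated` is only used by the :designated criterion.
def persuasion_succeeded?(convinced, criterion, designated = [])
  case criterion
  when :unanimity  then convinced.values.all?
  when :majority   then convinced.values.count(true) * 2 > convinced.size
  when :designated then designated.all? { |party| convinced[party] }
  else raise ArgumentError, "unknown criterion: #{criterion}"
  end
end
```

Making the criterion a parameter shows how the same dialogue state can count as a success under one rule and as a failure under another, which is exactly why the choice influences the strategies of the parties.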

2.6 Properties of the Dialogue 

There are some interesting new properties to be checked for multi-party dia- 
logues. E.g. can we guarantee that an inquiry dialogue protocol will deliver the 
answer if the union of all the knowledge of the parties in the dialogue would be 
enough to derive this answer? 

Other issues would involve whether a protocol only reveals information to par- 
ticipants that they need to know in order to respond or whether information is 
released that parties would rather not divulge if not needed. 

A typical issue coming up in human conversations is whether all parties have an 
equal opportunity to reach their part of the goal of the dialogue. I.e. can they 
put forward all their arguments at the right time to convince some party or the 
whole group? 

Of course the question of guaranteed termination of a protocol is also very rele- 
vant for a multi-party dialogue. 

2.7 Internal Operation of the Agents 

Besides the issues described above on the external properties of a multi-party 
dialogue one might also want to consider whether the parties should have some 
extra internal properties to take part in a multi-party dialogue. One obvious 
candidate is how agents determine when and with what content to respond to 
which other parties. 

2.8 Conclusion 

The issues discussed in this section probably do not cover the whole field, but 
hopefully give a good insight into the landscape of multi-party dialogues and its 
challenges. Far from trying to answer all of the above questions, in the rest of 
this paper we will sketch a framework in which these questions can be studied 
and possible answers formulated in an empirical way. 

Our hope is that by working on a systematic implementation where all possible 
choices are made explicit and adjustable we will in the end create a test-bed that 
reveals all issues and can be used to test what the consequences are of certain 
combinations of features. 

In order to arrive at this test-bed that should register all choices, we will start 
with a very simple implementation that only contains the core of multi-party 
dialogues that should be shared by all forms. 

3 A Starting System: Blackboard Dialectic 

As said before the basis of agent conversations can be found in linguistics and 
one of the most relevant areas is that of computational dialectics. So, we will 
take the work from this field as a basis for our own experimental work and dis- 
cuss the methods of computational dialectic [1990-now] in the next section. 
Because multi-party dialogues require different methods of communication than 
only the one-to-one and one-to-many we decided to use the blackboard system 
metaphor as a basis for communication. Blackboard systems are probably the 
most generic form of multi-party communication. More sophisticated communication 
forms can easily be built on top of these systems. We discuss the methods 
of blackboard systems [1975-now] in the second subsection below. 

Computational dialectic. Influenced by work on nonmonotonic reasoning and 
argumentation theory, computational dialectic emerged as an area that makes 
significant progress with the (formal) synthesis and (computational) execution 
of artificial argument and dispute [12, 26, 1]. As far as we know, however, most 
if not all work in computational dialectic models disputes in which two parties, 
named PRO and CON, exchange arguments in such a manner that each party 
is obliged to respond immediately to moves of its (or his, or her) opponent. In 
his pioneering paper “process and policy,” Loui termed this type of dispute 
two-party immediate response dialectic [13]. An example of a two-party 
immediate response dispute, adapted from [25], is the following. 

Example. Two parties, named PRO and CON, have arguments of the most simple 
sort for and against A, respectively. 

    Proposition   DOB      Proposition       DOB 
    B             1.00     B -(0.97)-> A     1.00 
    C             1.00     C -(0.91)-> ¬A    1.00 

Thus, both believe all four propositions B, C, B -(0.97)-> A, and C -(0.91)-> ¬A 
to be absolutely true. This certainty is expressed as a degree of belief (DOB) of 
1.00. The implicational strength with which the two rules imply their conclusion 
is 0.97 and 0.91, respectively. Accordingly, the dispute evolves as in Table 1. 

The first column of numbers in Table 1 contains line numbers. The second column 
contains pairs of numbers separated by a comma. The first number of each pair 
indicates the level of the dispute, i.e., the number of times that the burden of 
proof has alternated. For example, if PRO starts a dispute on A and, within this 
dispute, CON starts a dispute on B and, within CON’s dispute, PRO starts a dispute 
on C, then the dispute is at level three. The second number of each pair 
indicates the depth of the dispute, i.e., the number of times that the defending 
party justifies his claim via regression through rules. 

The point of this sample dispute is to note that a two-party immediate 
response protocol forces both parties to enter and explore every part of the search 
space in a strict depth-first search fashion, something that is not akin to 
informal dispute. Still, the two-person dispute system presented in [25] provides a 
number of clues to multi-party dialogue. Therefore, the example implementation 
presented in this paper continues to build on the system presented in [25]. 

Table 1. Output of a simple two-person immediate response dialogue. 

 1.  0,0  pro:  I claim that A holds. 
 2.  1,0  con:  Why? 
 3.  0,0  pro:  Because it follows from B with strength 0.97, and I claim that B 
                holds. 
 4.  1,1  con:  Why? 
 5.  0,1  pro:  Simply because B is the case with DOB = 1.00 
 6.  1,1  con:  Frankly, I am willing to contest B: I claim that ¬B holds. 
 7.  2,1  pro:  Why? 
 8.  1,1  con:  Um... upon closer inspection I see I have no grounds for ¬B. 
 9.  0,1  pro:  That leaves me with B, obviously. And via B my A is supported 
                0.97 
10.  1,0  con:  Sure, but now this: I claim that ¬A holds. 
11.  2,0  pro:  Why? 
12.  1,0  con:  Because it follows from C with strength 0.91, and I claim that C 
                holds. 
13.  2,1  pro:  Why? 
14.  1,1  con:  Simply because C is the case with DOB = 1.00 
15.  2,1  pro:  Frankly, I am willing to contest C: I claim that ¬C holds. 
16.  3,1  con:  Why? 
17.  2,1  pro:  Um... upon closer inspection I see I have no grounds for ¬C. 
18.  1,1  con:  That leaves me with C, obviously. And via C my ¬A is supported 
                0.91. Nevertheless, I grant that earlier on (line 9), you were able to 
                support A more strongly. Therefore I drop ¬A. 
19.  0,0  pro:  Combining my support (line 1-9) and your counter-support (line 
                10-18) yields 0.97 for A, which means that I am right on A. 
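
The numbers in lines 9, 18 and 19 of Table 1 can be retraced with a short Ruby sketch. Two assumptions are made here that the example itself does not spell out: that the support a rule lends to its conclusion is the degree of belief in its premise multiplied by the strength of the rule, and that the more strongly supported claim prevails. Since both premises have a DOB of 1.00, the example cannot distinguish this from other combination rules (such as taking the minimum). The function names are hypothetical.

```ruby
# Assumption: support = degree of belief in the premise * rule strength.
def support(premise_dob, strength)
  premise_dob * strength
end

# Assumption: the more strongly supported side prevails. A tie is resolved
# in CON's favour here, which is an arbitrary choice for this sketch.
def prevailing_claim(pro_support, con_support)
  pro_support > con_support ? 'A' : 'not A'
end

pro = support(1.00, 0.97)   # B -(0.97)-> A
con = support(1.00, 0.91)   # C -(0.91)-> not A
puts "PRO: #{pro}, CON: #{con}, prevailing claim: #{prevailing_claim(pro, con)}"
```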

Blackboard systems. Another development relevant to multi-agent dispute is that of 
so-called blackboard systems [2, 7, 8]. We see blackboard systems as the most basic 
form of communication medium. Many other systems can be seen as refinements 
of this system. A blackboard system is an expert system based on the blackboard 
metaphor. This metaphor says that, to solve a problem, or to answer a question, 
a number of more or less independent artificial experts, named knowledge sources 
in BB-terminology, should communicate by means of a central medium, the 
blackboard from which all knowledge sources may read and write. Of course, 
there must be some protocol to arrange who goes to the blackboard if more than 
one expert “reaches for the chalk”, and there must be agreement on a common 
language that experts use, but these questions are clearly addressed in research 
on blackboard systems. In the initial stages of research on knowledge based 
systems, the motivation to work with blackboards was that such systems are less 
compiled and more modular than conventional expert systems. The idea was 
that they should be more transparent and easier to maintain than monolithic 
expert systems. After some time, however, research on blackboard systems lost 
focus, because it was claimed that they were not fast enough to meet the DARPA 
criteria at that time [10, p. 350]. (An argument that has now become obsolete.) 

In 2003 it is fair to state that research on multi-agent systems has turned 
the blackboard metaphor into something that does not live up to contempo- 
rary standards of agent-communication and geographically distributed expertise. 
However, research on blackboard systems can be put into new light by elabo- 
rating on the idea of a newsgroup expert discussion. For example, if a computer 
programmer has a question on feature X of the newest release of programming 
language Y, he or she can post a question to the appropriate newsgroup and may 
expect an answer within a time range of one to twenty-four hours (depending 
on the turnover of a particular group). More specifically, we propose to carry over 
the idea of cooperating agents from an internet metaphor, such as the newsgroup 
metaphor or, in Google terminology, the discussion groups metaphor, to a simple 
but new protocol for the exchange of knowledge. Our motivation for this step is 
that news and discussion groups are the de facto standards to let human agents 
communicate asynchronously in public. The example implementation presented 
in this paper adopts the blackboard metaphor (rather than adopting its precise 
architecture) and gives it a dialectic twist. 

4 Implementation 

In this and the following section we describe the implementation of a dialectic 
blackboard architecture implemented in Ruby. Ruby is a kind of crossbreed 
between Perl and Smalltalk created around 1993 by Yukihiro Matsumoto [15]. 
We chose Ruby because it is a pure and sober object-oriented scripting 
language with an intuitive syntax, extremely suited for prototyping. (Some have even 
advocated Ruby as executable pseudo code.) The de facto authority is 
http://www.ruby-lang.org/ (see footnote 1). The de facto reference is what 
aficionados call “the pickaxe book” [23]. We hope to be able to implement the 
dialogues on the 3APL 
platform in the coming year (see [9] for a basic description of 3APL). This will 
allow us to use “real” agents to implement the knowledge of each agent and also 
the conversation rules that the agents use can easily and explicitly be represented 
in 3APL. A second advantage is that the platform is readily available for us and 
we can change the platform structure to suit our dialogue needs. 

Based on two-party disputes in computational dialectic on the one hand, 
and based on the blackboard metaphor in expert systems on the other hand, 
we start with a dialectic blackboard architecture with the following multi-party 
properties: 

i. A fixed number of equivalent participants engage in an inquiry dialogue, 
with the goal to discover how to undertake a particular action. So, in terms 
of section 2.1 we start with a closed system. We start with inquiry dialogues, 
because they are the easiest with respect to regulating turn taking. 

ii. There are no specific roles for the agents. They are equivalent in all respects. 

iii. Agents communicate through a central medium, called the forum, the function 
of which may be compared to the function of an internet newsgroup. 
Messages are public. They are not addressed to specific agents. 

iv. Agents act (listen, reason, and speak) in turn, for a fixed number of rounds. 
The current experiment, for instance, involves 3 agents and 7 rounds, which 
implies the performance of 21 atomic acts. 

v. There is no criterion for termination. Cf. point (iv). 

1 A websearch with keyword “Ruby” should give enough leads. 

The following properties are not typical multi-party issues, but also determine 
the course of a dialogue. 

a. Participants are cooperative. They do not lie about their beliefs. All agents 
acknowledge and process all messages (insofar as these messages are applicable), 
all agents have ample time to reason, and all agents have the opportunity 
to post all the messages they wish. 

b. Agents have logical capabilities. In particular, they do not ask what they 
already know or can infer. Before asking, an agent tries to infer the desired 
item itself. 

c. The facilitation of information is dialectic: claims are justified with other 
claims or denied with reasons that support a contradiction. Agents accept 
claims if and only if they can be resolved to information that they believe 
to be true. 

d. Regression to previous messages is always possible. Agents are allowed to 
question or justify prior claims. Thus, an immediate response is not required. 

e. For simplicity’s sake the agents have a shared ontology. One consequence 
of this assumption is that propositions (internal representations of claims) 
conveyed through messages are not renamed. 

4.1 Notation 

We use the following notation as convenient shorthand in our implementation. 

Definition 1. A discussion group G = (A, F) is a (possibly infinite) set of 
agents A that communicate by means of a forum F. Individual agents are denoted 
by a_i. 

A forum F = {m_i | 1 ≤ i ≤ n} is an array of messages, where m_n is the last 
message published. A message m_i = (q, A) is a tuple consisting of a question q, 
possibly followed by one or more answers to that question. The set of answers, 
A, is an ordered list. 

The internal structure of agents, questions and answers is left unspecified in the 
theory, but a possible interpretation is elaborated in the rest of this section. Each 
agent a_i possesses a name, knowledge, a number of questions to be answered, a 
bookmark to remember the first unread item, bookmarks per question for the 
first unread answer of that question, and a hash to remember which questions 
have already been answered (Figure 1). 

    @name = 'Pete' 
    @knowledge = { 'a' => [['f', 'b'], ['c', 'd']], 
                   'd' => [['e', 'b'], ['c']], 
                   'p' => true 
                 } 
    @questions = { 'a' => nil,   # new question 
                   'd' => 3,     # first unread answer 
                 } 
    # first unread article: 
    @fid = 45                    # @fid means "forum identifier" 
    @bookmark = { 4 => 2, 56 => 6 } 
    @answered = { 4 => true, 8 => true } 

Fig. 1. Datastructure per agent. 

    N => { 
      'owner'    => 'Claude', 
      'question' => 'bake a cake', 
      'answers'  => [ { 'owner'  => 'Francois', 
                        'answer' => ['knead the flower', 'bake the paste'] } ] 
    } 

Fig. 2. Datastructure per question. 



4.2 Data Structures 

The agent’s knowledge is implemented as a hash, where each atomic action is 
mapped onto a set of alternative preconditions, or to the value true. The latter 
indicates that the agent knows how to perform this action. 

The scenario consists of a (theoretically unlimited) number of agents which 
run concurrently. Furthermore, the scenario accommodates a forum that is 
shared by all agents. The forum is a passive medium, but is otherwise responsible 
for the management and administration of messages and personalities 
(agent-id’s). In our current model, these bookkeeping activities amount to no more than 
maintaining a hash that maps questions onto message-id’s (a so-called reverse 
index). 
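
A passive forum with such a reverse index could look roughly as follows. This is a sketch under assumptions, not the code of the implementation discussed here: the class name Forum and its method names are hypothetical, the message-id is taken to be the position of the message in the array, and only the field names 'owner', 'question', 'answers' and 'answer' follow Figure 2.

```ruby
# A passive forum: an array of messages plus a reverse index that maps
# the text of a question onto its message-id.
class Forum
  def initialize
    @messages = []   # the message-id is the position in this array
    @index = {}      # reverse index: question => message-id
  end

  # Returns the message-id of a question, or nil if it was never asked.
  def id_of(question)
    @index[question]
  end

  # Publishes a question once; asking it again returns the existing id.
  def publish_question(owner, question)
    @index[question] ||= begin
      @messages << { 'owner' => owner, 'question' => question, 'answers' => [] }
      @messages.length - 1
    end
  end

  # Appends an answer (a list of elements) to the question with this id.
  def publish_answer(id, owner, answer)
    @messages[id]['answers'] << { 'owner' => owner, 'answer' => answer }
  end

  def answers(id)
    @messages[id]['answers']
  end
end
```

The forum itself takes no initiative. It only stores what the agents publish and lets them look up whether a question has been asked before.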

Each agent may be in a consumptive mode or in a productive mode. In the 
consumptive mode an agent takes actions that are supposed to deal with the 
accumulation of new knowledge. In this case, we have limited this part of the 
algorithm to two actions, viz. reading the forum and publishing questions to it. 
In the productive mode an agent does things that are supposed to deal with the 
dissemination of (private) knowledge. In our case, we have limited this part to 
answering questions of other agents. In the following paragraphs, we describe 
how an agent reads a forum, publishes questions, and responds to questions of 
other agents. 

4.3 Consumptive Mode 

An agent reads news by cycling through its own (typically short) list of unanswered 
questions. (Cf. Figure 1.) For each such question, it locates the message-id 
of this question by means of a reverse index. (Recall our assumption about the 
administrative responsibilities of the forum-object.) If the question is absent, 
the agent publishes its question and proceeds with its next (and possibly 
non-assimilating) activity (see footnote 2). Else, if the question is already present 
in the forum, the agent starts scanning the answers to that question, beginning 
with the first answer not read (by that agent), and ending with the last answer. 
The point where the agent stopped reading answers to that particular question the 
previous time is called the agent’s bookmark for that question. Prior to the first 
scan of a question, the bookmark is typically set to zero. After scanning all the 
answers to a question, the bookmark is put behind the last answer. 



Agent method 1 Reading answers. 

Require: index 

    while @bookmark[index] < public.answers(index).length do 
      question = public.question(index) 
      @bookmark[index] += 1 
      answer = public.answer(index, @bookmark[index]) 
      # the answer is stored even if it does not (yet) suffice 
      incorporate_justification(question, answer) 
      if elements_of_explanation_suffice?(answer) then 
        delete_question(question) 
        break 
      end 
    end 



When an agent reads an answer it first incorporates the question/answer- 
combination into its knowledge base. It then verifies whether the answer really 
suffices as an answer (a process to be explained shortly). If the answer suffices, 
the agent deletes the question from its private question list and marks it as 
answered in the forum. It also stops reading further answers to this question. 
Else, if the answer does not suffice, the agent proceeds reading answers to that 
particular question. Notice that an answer is incorporated in the knowledge 
base, irrespective of whether it suffices as an answer or not. This is because an 
unsatisfactory answer may become useful in a later stage of the process if further 
knowledge becomes available. 

2 A published question may roughly be compared to what is called a knowledge source 
activation record (KSAR) in blackboard systems. Cf. [2]. 

To verify whether a particular answer is satisfactory, an agent verifies whether 
all elements of that answer can be reduced to its own knowledge. If one such 
element cannot be resolved, the verification of the other elements is suspended. 
This is because answers to dependent issues become irrelevant when the original 
issue cannot be resolved. To verify whether an element can be reduced 
to private knowledge, the agent verifies whether it has marked the element as 
true in its own knowledge base, or whether its knowledge base contains a rule 
supporting that element. In the latter case, the algorithm recursively applies the 
verification process until the agent encounters private facts or, in the negative 
case, cannot further justify one particular element. When a particular element 
cannot be further reduced (i.e., justified) it is published as a question to the 
forum. 

The last type of action is crucial to the process as it closes the consump- 
tion/production cycle of questions and answers. 
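
The verification procedure described above can be sketched as a pair of mutually recursive Ruby functions. The function names, the `questions` accumulator and the guard against cyclic rules are assumptions of this sketch. The knowledge hash has the shape described in section 4.2, and the sample entries are loosely modelled on Arie's situation in the dialogue of section 5.

```ruby
# `knowledge` maps an action to true (the agent can perform it) or to a
# list of alternatives, each alternative being a list of sub-actions.
# Actions that cannot be reduced are collected in `questions`.
def reducible?(knowledge, element, questions, seen = [])
  return false if seen.include?(element)   # guard against cyclic rules (our addition)
  entry = knowledge[element]
  return true if entry == true             # private fact
  if entry.nil?                            # no fact and no rule: ask the forum
    questions << element unless questions.include?(element)
    return false
  end
  entry.any? { |alt| suffices?(knowledge, alt, questions, seen + [element]) }
end

# An answer suffices if all its elements reduce. `all?` stops at the first
# failure, which mirrors suspending the verification of the other elements.
def suffices?(knowledge, answer, questions, seen = [])
  answer.all? { |element| reducible?(knowledge, element, questions, seen) }
end

knowledge = {
  'find the leak' => [['inspect the tire', 'mark the leak']],
  'mark the leak' => true,
  'seal it'       => true
}
questions = []
puts suffices?(knowledge, ['find the leak', 'seal it'], questions)   # prints false
puts questions.inspect                                               # prints ["inspect the tire"]
```

In this run the agent gets stuck on 'inspect the tire', never looks at 'mark the leak' or 'seal it', and ends up with exactly one new question to publish.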



4.4 Productive Mode 

When in productive mode an agent reads the news chronologically, starting at 
the first message that was not read by that agent. Thus every agent now possesses 
two types of bookmarks: one global bookmark for keeping track of questions, and 
one for each question to keep track of the answers to that question. Since it does 
not seem to make much sense to try to answer one’s own questions, an agent 
skips its own messages (i.e., questions) if it is in productive mode. A question is 
also skipped if an agent has given all its answers to that particular question. Else, 
it publishes an answer to that question. This answer must not already have been 
published by other agents or by the agent itself. Answers to questions cannot (and 
need not) be bookmarked, because other agents may have contributed possibly 
identical answers in the meantime. 

After this action, the agent proceeds with its next (and possibly non-pro- 
ductive) activity. 
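
One productive step might be sketched in Ruby as follows. The function name and signature are hypothetical, the forum is represented as a plain array of messages shaped as in Figure 2, and `fid` is the agent's global bookmark. Whether the bookmark should stay on a question after the agent has answered it is not specified above; in this sketch it stays, so that remaining alternatives can be contributed in later steps.

```ruby
# Performs one productive step for agent `name` and returns the new bookmark.
def productive_step(name, knowledge, forum, fid)
  while fid < forum.length
    message = forum[fid]
    alternatives = knowledge[message['question']]
    published = message['answers'].map { |a| a['answer'] }
    # first alternative that nobody (including this agent) has published yet
    fresh = alternatives.is_a?(Array) ? alternatives.find { |alt| !published.include?(alt) } : nil
    if message['owner'] != name && fresh
      message['answers'] << { 'owner' => name, 'answer' => fresh }
      return fid   # one answer per step; stay on this message
    end
    fid += 1       # own question, or nothing new to contribute: skip it
  end
  fid
end
```

Because the list of published answers is recomputed on every visit, identical answers contributed by other agents in the meantime are filtered out without any per-answer bookmark.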

It turns out that the consumer-part of the algorithm is more complex than 
the producer part, which might seem to reinforce an old adage, namely, that 
listening is more difficult than speaking. 



5 A Sample Dialogue 

As an example, suppose we have three equivalent parties, Arie, Bill and Claude. 
Arie stands next to his bike. He has pulled the front wheel out of the fork 
and holds it up. Bill and Claude are two friends of Arie and stand next to him. 
(Cf. Figures 3-5.) Bill’s neg-knowledge indicates that pulling out a wheel and 
standing there are perfect conditions to ensure that you can no longer go to 
the hardware store. With this information, the dialogue ensues that is shown in 
Table 2 at the end of this article. 

Table 2. Ruby-generated dialogue.

 1. Arie:    asks: “How do I repair a flat tire?”
 2. Bill:    responds: “Arie, if you want to repair a flat tire then you must
             buy a new tire and apply it”
 3. Claude:  responds: “Arie, if you want to repair a flat tire then you must
             find the leak and seal it”
 4. Arie:    tries to recall for himself how to buy a new tire ..
 5. Arie:    .. he realizes he does not know how to buy a new tire
 6. Arie:    therefore suspends his attempt to apply it
 7. Arie:    tries to recall for himself how to find the leak ..
 8. Arie:    realizes that he knows how to find the leak
 9. Arie:    knows that find the leak can be done by trying to inspect the
             tire and then mark the leak
10. Arie:    tries to find the leak by trying to inspect the tire and then
             mark the leak
11. Arie:    tries to recall for himself how to inspect the tire ..
12. Arie:    .. he realizes he does not know how to inspect the tire
13. Arie:    therefore suspends his attempt to mark the leak
14. Arie:    therefore suspends his attempt to seal it
15. Arie:    asks: “How do I buy a new tire?”
16. Arie:    asks: “How do I inspect the tire?”
17. Bill:    responds: “Arie, if you want to buy a new tire then you must
             go to the hardware store”
18. Claude:  responds: “Arie, if you want to inspect the tire then you must
             put the tire in a bucket and watch for bubbles”
19. Arie:    tries to recall for himself how to go to the hardware store ..
20. Arie:    .. he realizes he does not know how to go to the hardware store
21. Arie:    asks: “How do I go to the hardware store?”
22. Bill:    responds: “Arie, if you want to go to the hardware store then
             do not pull out wheel from fork and stand there like a looney”
23. Arie:    knows how to pull out wheel from fork and does so.
24. Arie:    tries to recall for himself how to stand there like a looney ..
25. Arie:    .. he realizes he does not know how to stand there like a looney
26. Arie:    asks: “How do I stand there like a looney?”
27. Arie:    knows how to put the tire in a bucket and does so.
28. Arie:    knows how to watch for bubbles and does so.
30. Arie:    now inspect the tire by put the tire in a bucket, watch for
             bubbles first and then inspect the tire
31. Arie:    does mark the leak
32. Arie:    does seal it

Looking at Figures 3-5, we see that the formulation of lines 23-26 of Table 2 
is somewhat unfortunate. This is so because the elements of the knowledge-base 
are actually action descriptions, rather than verifications of the state of an 
agent. Line 23 actually indicates that the wheel has been pulled out, and lines 
25-26 actually indicate that Arie cannot verify (or does not agree) that he 
looks foolish. Apart from this and similar linguistic glitches (that were to be 
expected due to interpolating 3 agent-specific strings into different template 
utterances), we maintain that the dialogue otherwise exhibits a surprisingly 
natural flow of messages and message justifications, so much so, we think, that 
the algorithm hints at a further and more mature elaboration. 
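
The interpolation into template utterances mentioned above can be illustrated 
with a few lines of Ruby. This is only a sketch and not the code that generated 
Table 2: the method name respond and its parameters are assumptions introduced 
here for illustration. 

```ruby
# Hypothetical helper (not from the original implementation): builds a
# response utterance by interpolating the addressee, the goal, and the list
# of sub-actions into a fixed template.
def respond(addressee, goal, steps)
  "#{addressee}, if you want to #{goal} then you must #{steps.join(' and ')}"
end

puts respond('Arie', 'repair a flat tire', ['buy a new tire', 'apply it'])
```

Because the inserted strings are action descriptions taken verbatim from the 
knowledge base, a template of this kind would also account for glitches such as 
those noted for lines 23-26. 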

3 Like Perl’s string interpolation: if $year is 1984, then "It’s $year" evaluates to 
"It’s 1984". 

    Agent.new(
      'name' => 'Arie',
      'questions' => {
        'repair a flat tire' => TRUE
      },
      'knowledge' => {
        'pull out wheel from fork' => TRUE,
        'stand there like a looney' => TRUE,
        'seal it' => TRUE,
        'find the leak' =>
          [['inspect the tire', 'mark the leak']],
        'put the tire in a bucket' => TRUE,
        'watch for bubbles' => TRUE })

Fig. 3. Creation of Arie. 

    Agent.new(
      'name' => 'Bill',
      'questions' => {},
      'knowledge' => {
        'repair a flat tire' =>
          [['buy a new tire', 'apply it']],
        'buy a new tire' =>
          [['go to the hardware store']] },
      'neg-knowledge' => {
        'go to the hardware store' =>
          [['pull out wheel from fork',
            'stand there like a looney']] })

Fig. 4. Creation of Bill. 
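
The knowledge bases in the figures map an action either to TRUE or to a list of 
alternative decompositions into sub-actions. The sketch below shows one way 
such an entry could be consulted; it is not the authors' Agent class. The 
method next_move, its return values, and the use of a subset of Arie's 
knowledge (with TRUE written as true) are assumptions made for illustration, 
and the sketch does not attempt to reproduce Table 2. 

```ruby
# Hypothetical sketch, not the authors' Agent class. A knowledge base maps an
# action either to true (the agent can perform it directly) or to a list of
# alternative decompositions, each of which is a list of sub-actions.
ARIE_KNOWLEDGE = {
  'seal it' => true,
  'find the leak' => [['inspect the tire', 'mark the leak']],
  'put the tire in a bucket' => true,
  'watch for bubbles' => true
}.freeze

# Returns [:perform], [:decompose, steps], or [:ask, question] for a goal.
def next_move(knowledge, goal)
  entry = knowledge[goal]
  case entry
  when true  then [:perform]
  when Array then [:decompose, entry.first] # take the first alternative
  else            [:ask, "How do I #{goal}?"]
  end
end

p next_move(ARIE_KNOWLEDGE, 'find the leak')
p next_move(ARIE_KNOWLEDGE, 'inspect the tire')
```

An unknown action yields a question, which corresponds to the point in the 
dialogue where an agent suspends its attempt and asks the other parties. 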

6 Related Work 

First of all we should reiterate that not much other work has been done on 
multi-party dialogues. This does not mean that there is no related work at 
all: much of the work on modelling multi-party dialogues can be based on or 
inspired by the two-party dialogue frameworks. In this section we will briefly 
discuss the work of Amgoud et al. on formal inter-agent dialogues [20]. Then, we 
will discuss some other work on agent communication that is relevant for ours. 

In their study on argumentation-based dialogues between agents, Amgoud et 
al. define a set of locutions by which agents can trade arguments, a set of agent 
attitudes which relate what arguments an agent can build and what locutions 
it can make, and a set of protocols by which dialogues can be carried out. They 
then consider some properties of dialogues under the protocols, in particular 
termination and complexity, and show how these relate to the agent attitudes. 

    Agent.new(
      'name' => 'Claude',
      'questions' => {},
      'knowledge' => {
        'know that the hardware store closed' => TRUE,
        'repair a flat tire' =>
          [['find the leak', 'seal it']],
        'inspect the tire' =>
          [['put the tire in a bucket',
            'watch for bubbles']] },
      'neg-knowledge' => {} )

Fig. 5. Creation of Claude. 



Compared to our work, Amgoud et al.’s analysis is more logic-oriented. 
Rather than being focused on a prototypical implementation, it is more directed 
towards results (theorems) on the correctness of verification protocols, and the 
existence of termination conditions. The properties of the dialogue follow 
completely from the properties of the agents participating in the dialogue plus 
some other (implicit) assumptions on turn taking etc. This work is certainly 
usable as a basis for proving properties of multi-party dialogues as well. It 
will be interesting to implement the agents as modelled logically by Amgoud et 
al. and check whether the properties do indeed hold and whether they would 
still hold when more than two parties are involved. 

Of course, much work from the area of agent communication in general is 
relevant for this work as well. See [4, 3] for an overview. Of particular 
interest is the work on the semantics of agent communication. The semantics of 
the communication determines in large part which messages can be sent at which 
moment in time. The work of Singh [21, 22] is illustrative of this line of 
research, which is geared towards getting the speech acts right. In [21], Singh 
provides a formal semantics for the major kinds of speech acts. In 
particular, Singh claims that in order for multi-agent systems to be formally and 
rigorously designed and analyzed, a semantics of speech acts that gives their 
objective model-theoretic conditions of satisfaction is needed. In [22], this 
work is elaborated further and results in normative constraints for felicitous 
communication. 

In [11] Kumar et al. describe group communication with agents. The work 
especially deals with the semantics of speech acts when a group of other agents 
is addressed. That is, it distinguishes addressees and overhearers and tries to 
capture this difference in the semantics as well. This work will certainly be 
used when we extend our framework to take this type of message into account. 

It will also be interesting to check how multi-party communication fits in 
the framework provided by the FIPA specifications [6]. These specifications are 
geared towards modelling communications between two agents, even though 
more parties can be involved in the conversation. First of all it should be 
examined whether the standard messages contain all components needed to model 
dialogues between multiple parties. Secondly, one should carefully look at the 
FIPA ACL semantics to check whether they are still sensible in the context of a 
multi-party dialogue. 

7 Conclusion 

In this paper we have scanned the landscape of multi-party dialogues. Some 
issues have been discussed that surface when changing from two-party dialogues 
to multi-party dialogues. Many issues come up that have crucial influence on 
the ensuing dialogues. In order to get some grip on this area, we set out to 
implement a multi-party dialogue architecture in which we can systematically 
research these issues. 

We emphasize again that the implemented example protocol is extremely 
simple and is not designed to compete with established and more elaborated 
protocols for the exchange of knowledge. 

What we do hope, however, is that our example has illustrated the possibility 
of setting up a test-bed on the basis of this implementation in which a large 
number of parameters can be set and in which different ideas on multi-party 
deferred-response protocols can be tested and improved. 

What we have shown already is how two-person immediate response protocols 
can be generalized and extended to a multi-agent setting by means of a simple 
blackboard or forum metaphor. 

As next steps we have to look at the way turn taking is organized in the 
system. At the moment this is coded in the system itself and cannot be regulated 
either by the user or by the agents. The next point will be to extend the 
dialogues to include persuasion dialogues. This is a big step because it 
requires the agents to be able to process inconsistent information and to find 
counter-arguments for arguments. Also interesting is the issue of linguistic 
roles. Instead of using a plain blackboard structure we can make sections that 
are visible to different sets of participants. In this way we can simulate 
different linguistic roles. Finally, we hope to start looking into properties 
such as termination of the dialogues depending on these different parameters. 

Abstract. We propose group communication for agent coordination 
within “active rooms” and other pervasive computing scenarios featuring 
strict real-time requirements, inherently unreliable communication, 
and a large but continuously changing set of context-aware autonomous 
systems. Messages are exchanged over multicast channels, which may be 
reminiscent of chat rooms in which everybody hears everything being said. The 
issues that have to be faced (e.g., changing users’ preferences and 
locations; performance constraints; redundancies of sensors and actuators; 
agents on mobile devices continuously joining and leaving) require the 
ability to dynamically select the “best” agents for providing a service 
in a given context. Our approach is based on the idea of implicit 
organization, which refers to the set of all agents willing to play a given role 
on a given channel. An implicit organization is a special form of team 
with no explicit formation phase and a single role involved. No middle 
agent is required. A set of protocols, designed for unreliable group 
communication, is used to negotiate a coordination policy, and for team 
coordination. Preconditions and effects of these protocols are formalized 
by means of the joint intention theory (JIT). 



1 Introduction 

So-called “active rooms” or “active environments” are pervasive computing 
scenarios providing some form of “ambient intelligence”, i.e. some form of 
automatic, sophisticated assistance to humans performing physical or cognitive 
tasks by specialized devices present in the same place. Active environments 
often feature a large and continuously changing set of context-sensitive, partly 
mobile autonomous systems; thus, a multi-agent architecture is a natural choice. 
As an example, our current domain of application 1 is interactive cultural 
information delivery within museums or archaeological sites, for which we 
develop multi-user, multi-media, multi-modal systems. Agents, which are 
distributed on both static and mobile devices, guide visitors, provide 
presentations, supervise crowds, and so on, exploiting whatever sensors and 
actuators are close to the users during their visit. The agents must 
immediately react to changes in the focus of attention or to movements of the 
users, in order not to annoy them with irrelevant or unwanted information. To 
make things even harder, wireless networks (required to support mobility) are 
intrinsically unreliable for a number of reasons. Consequently, agents have to 
deal, in a timely and context-dependent way, with a range of problems that 
include unexpectedly long reaction times by cooperating agents, occasional 
message losses, and even network partitioning. 

1 This work was supported by the PEACH and TICCA projects, funded by the 
Autonomous Province of Trento (Provincia Autonoma di Trento, Italy). 

We propose a form of group communication, called channeled multicast [1], 
as the main technique for agent coordination in ambient intelligence scenarios. 
Group communication is an active area of agent research (see [6] in this volume); 
among its advantages, it often reduces the amount of communication needed 
when more than two agents are involved in a task, and allows overhearing of the 
activity of other agents 2 . Overhearing, in turn, enables monitoring [11], group 
formation (see [15] in this volume), and the collection of contextual information; 
by overhearing, an agent can understand the state of other agents and possibly 
build models of their information needs (see [24] in this volume), leading to 
pro-active assistance [3]. 

This work describes an agent coordination technique based on group com- 
munication and provides its initial formal ground. Specifically, we define a set 
of social conventions, formalized with the Joint Intention Theory [5], for estab- 
lishing and enforcing a coordination policy among agents playing the same role. 
This approach addresses issues of redundancies and adaptation to the context 
without the intervention of middle agents. Our communication infrastructure, 
being based on IP multicast, features almost instantaneous message distribution 
on a local area network but suffers from occasional message losses. Consequently, 
protocols have been designed to work under the assumption of unreliable com- 
munication. 

This paper is organized as follows. The next section presents our reference 
scenario, while the following one (Sec. 3) focuses on our objectives and our 
current technology. Sec. 4 introduces the concept of implicit organization, 
that is, a team of agents coordinating to play a role on a channel. Sec. 5 
describes the interaction between agents and implicit organizations. The 
behavior of an organization is formally described in Sec. 6. The following 
three sections discuss how organizations decide their own coordination 
policies, describe a few of them, and show some examples (Sections 7, 8, and 9 
respectively). Sec. 10 compares some works available in the literature with ours. 



2 Listeners that are not explicitly addressed are classified as auditors, 
overhearers, or eavesdroppers by [6]. They differ because of the attitudes of 
either speakers or hearers, but are indistinguishable from the perspective of 
the communication media. For the sake of simplicity, we drop this distinction 
and call them all overhearers. 

2 A Reference Scenario 

The PEACH project [20] aims at creating “active museums”, i.e. smart 
environments featuring multimodal I/O where a visitor would be able to interact 
with her environment via various sensors and effectors, screens, handheld 
devices, etc. In this challenging application domain, agents can come and go 
dynamically, visitors move around the museum (potentially carrying handheld 
devices), and communication media are heterogeneous and sometimes unreliable (a 
mix of wired and wireless networks is likely). An important consideration is 
that, if the system does not react in a timely manner to the movements and 
interests of the visitors, they will simply ignore it or, worse, will become 
annoyed. 

A typical problem that such a system should be able to solve is the follow- 
ing. A visitor is supposed to receive a multimedia presentation on a particular 
subject, but (1) several agents are able to produce it with different capabilities 
(pictures + text or audio + video) and variable availabilities (CPU load or com- 
munication possibilities) and (2) the visitor is close to several actuators (screens, 
speakers) each able to “display” the presentation, and the visitor carries a PDA 
which is also able to display it, albeit in a more limited way. In addition, several 
other visitors are nearby potentially using the actuators, and of course if the 
visitor does not get a presentation within a few seconds she will leave. 

3 Settings and Assumptions 

We deal here with cooperation under unreliable communication in a highly 
dynamic environment. We do not address the well-known problems of task 
decomposition or sub-goal negotiation. We assume that once agents are set to 
execute a task they know how to do it. Rather, we aim at achieving robustness 
and tolerance to failure in a setting where agents can be redundant, 
communication is unreliable, hardware can be switched off, and so on. Such an 
environment can evolve faster than the agents execute their task, so it is not 
feasible to use “traditional” techniques such as guided team selection [22] or 
shared planning [9]. Our objective is to avoid centralized or static solutions 
like mediators, facilitators or brokers, and rather to have a fully 
distributed and flexible system, without looking for optimality. 

Our experimental communication infrastructure, used in PEACH, is called 
LoudVoice and is based on the concept of channeled multicast [1]. LoudVoice 
uses the fast but inherently unreliable IP multicast - which is not a major 
limitation when considering that, as said above, our communication media are 
unreliable by nature. Channels in LoudVoice can be easily discovered by their 
themes, that is, by the main subjects of conversation; a theme is just a string 
taken from an application-specific taxonomy of subjects, accessible as an XML 
file via its URL. Having discovered one or more channels, an agent can freely 
“listen” to and “speak” on them by means of FIPA-like messages, encoded as 
XML documents. The header of a message includes a performative, its sender 
and one or more destinations; the latter can be agent identifiers, but any other 
expression is accepted (for instance, we use role names - see Sec. 5). 
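
The header structure just described can be pictured with a small sketch. It is 
not LoudVoice code: the Message struct, the addressed_to? method, and all 
identifiers and values are assumptions introduced for illustration, and the XML 
encoding is left out. 

```ruby
# Hypothetical sketch of a message header with a performative, a sender, and
# one or more destinations. The real system encodes messages as XML documents.
Message = Struct.new(:performative, :sender, :destinations, :content)

# An agent treats a message as addressed to it when a destination equals its
# own identifier or the name of a role it currently plays.
def addressed_to?(message, agent_id, roles)
  message.destinations.any? { |d| d == agent_id || roles.include?(d) }
end

msg = Message.new('REQUEST', 'visitor-assistant', ['presenter'], 'show fresco')
puts addressed_to?(msg, 'agent-7', ['presenter'])  # destination is a role name
puts addressed_to?(msg, 'agent-9', ['auctioneer']) # neither id nor role matches
```

Since every listener on a channel receives every message, a filter of this 
kind would have to run on the receiving side. 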
4 Implicit Organizations 

Following a common convention in multi-agent systems, we define a role as a 
communication-based API, or abstract agent interface (AAI), i.e. one or more 
protocols aimed at obtaining a cohesive set of functions from an agent. A 
simple example, mentioned in [1], is an auction system with two main roles: the 
auctioneer (which calls for bids, collects them and declares the winner) and the 
bidder (which answers calls for bids and commits to perform whatever 
transaction is requested when winning an auction). An agent may play more than 
one role, simultaneously or at different times depending on its capabilities and 
the context. 

We adopt the term organization from Tidhar [23], to refer to teams where 
explicit command, control, and communication relationships (concerning team 
goals, team intentions, and team beliefs respectively) are established among 
sub-teams. 

We call implicit organization a set of agents tuned to the same channel to 
play the same role and willing to coordinate their actions. The term “implicit” 
highlights the facts that there is no group formation phase (joining an 
organization is just a matter of tuning to a channel), and no name for it - the 
role and the channel uniquely identify the group. It is important to highlight 
that all agents of an implicit organization play the same role but they may do 
it in different ways - redundancy (as in traditional fault tolerant or high 
capacity, load-balanced systems) is just a particular case where agents are 
perfectly interchangeable. This situation is commonly managed by putting a 
broker or some other form of middle agent in charge of supervising the 
organization. By contrast, our objective is to explore advantages and 
disadvantages of an approach based on unreliable group communication, in a 
situation where agents can come and go fairly quickly, their capabilities can 
change or evolve over time, and it is not necessarily known a priori which 
agent can achieve a specific goal without first trying it out. 
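
The point that an implicit organization has no name and no formation phase can 
be sketched as follows. The Tuning struct, the implicit_organization method, 
and the channel and role names are assumptions for illustration only. In 
particular, the global list of tunings exists here just to make the example 
runnable; in the setting described above no agent or middle agent holds such a 
list. 

```ruby
# Hypothetical sketch: there is no registry of organizations. Membership is
# derived from which agents are tuned to a channel and play a given role.
Tuning = Struct.new(:agent, :channel, :role)

# The pair (channel, role) identifies the implicit organization.
def implicit_organization(tunings, channel, role)
  tunings.select { |t| t.channel == channel && t.role == role }
         .map(&:agent)
end

tunings = [
  Tuning.new('agent-1', 'room-3', 'presenter'),
  Tuning.new('agent-2', 'room-3', 'presenter'),
  Tuning.new('agent-2', 'room-3', 'auctioneer'),
  Tuning.new('agent-3', 'room-4', 'presenter')
]
p implicit_organization(tunings, 'room-3', 'presenter')
```

An agent joins or leaves simply by starting or stopping to listen, so the 
result of such a query may differ from one moment to the next. 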

An implicit organization is a special case of team. Generally speaking, a 
team includes different roles, and is formed in order to achieve a specific goal; as 
said above, an implicit organization includes all agents playing a given role on a 
channel at any given time. Goals for an implicit organization are automatically 
established by requests to achieve something and queries addressed to its role. 
In Tidhar’s terms, this is to say that a command relationship is established 
between any agent performing a goal-establishing communicative action (the 
“commanding agent”) and the implicit organization, whose consequence is that 
the latter is committed to achieving the goal. Section 5 below discusses the 
corresponding protocol. 

An implicit organization is in charge of defining its own control policy, which 
means: (1) how a sub-team is formed within the organization in order to achieve 
a specific goal; and, (2) how the intentions of this sub-team are established. 
For this initial work, a goal-specific sub-team is fixed to be simply all agents 
that are willing to commit immediately to achieving the organizational goal at 
the time this is established; i.e., there is no explicit sub-team formation, 
but rather introspection by each agent to decide whether or not it has enough 
resources immediately available. A coordination policy established for the 
organization is then used within a sub-team to decide who actually works 
towards achieving its goal, and possibly to coordinate the agents if more than 
one is involved. 

[Figure: sequence diagram between a Requester and a role. Messages from the 
Requester: REQUEST (Something) / QUERY-IF/REF (SomeBeliefs). Replies: 
DONE (Results) / INFORM (Results).] 

Fig. 1. A role-based interaction 

We assume that policies are well-known to agents; Section 8 describes some 
of those used in our domain. We assume the existence of a library of application- 
independent policies. Thus, it is possible to refer to a policy simply by its name. 
A policy, however, may have parameters that need to be negotiated before use 
- for instance, the currency used for auctions, or the master agent in a master- 
slave environment. We call policy instance a tuple composed of a policy name 
and ground values for its parameters. 
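
A policy instance as defined above can be pictured as a small value object. The 
sketch below is illustrative only: the PolicyInstance struct, the ground? 
method, the parameter keys, and the example values are assumptions, not part of 
the policy library described in this paper. 

```ruby
# Hypothetical sketch: a policy instance pairs a well-known policy name with
# ground values for its parameters.
PolicyInstance = Struct.new(:name, :params) do
  # True when every required parameter has a concrete (non-nil) value.
  def ground?(required)
    required.all? { |key| !params[key].nil? }
  end
end

auction = PolicyInstance.new('auction', { currency: 'EUR' })
pending = PolicyInstance.new('master-slave', { master: nil })

puts auction.ground?([:currency]) # parameter already negotiated
puts pending.ground?([:master])   # master agent still to be negotiated
```

Two Struct values with equal members compare as equal in Ruby, so agents 
holding the same name and parameter values would hold the same instance in this 
representation. 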

Note that a goal-specific sub-team may well be empty, e.g. when all agents of 
the implicit organization are busy or simply no agent is part of the organization. 
With unreliable communication, this case is effectively indistinguishable from the 
loss of the goal-establishing message (unless overhearing is applied; this is left to 
future work) or even from a very slow reaction; consequently, it must be properly 
managed by the commanding agent. These considerations have an important 
impact on the protocol between commanding agents and implicit organizations, 
as discussed below. 

5 Role-Based Communication 

In this initial work, we assume that any request - by which we mean any 
REQUEST and QUERY, using the FIPA performatives [8] - generates a 
commitment by an implicit organization to perform the necessary actions and 
answer appropriately (strategic thinking by the organization is not considered 
for now). Thus, in principle the interactions between commanding agents and 
implicit organizations are straightforward, and can be summarized in the simple 
UML sequence diagram of Fig. 1. A generic Requester agent addresses its request 
to a role $\rho$ on a channel; the corresponding implicit organization replies 
appropriately. 

As mentioned above, however, unreliable channels, continuous changes in the 
number of agents in the organization, and strict real-time constraints 
substantially complicate the picture. Fig. 2 is a finite state machine that 
captures, with some simplifications, the possible evolutions of the protocol. 
The events on the top half represent the normal case - no message loss, a 
goal-specific sub-team achieves the goal. The events in brackets on the lower 
half represent degraded cases. 

Fig. 2. Interacting with unreliable communication - a simplified protocol machine 



Consider, for instance, the cases where a request or its answer are lost, or no 
goal-specific sub-team can be formed. A timeout forces the commanding agent 
to reconsider whether its request is still relevant - e.g., the user’s context has not 
changed - and, if so, whether to resend the original message. It follows that an 
implicit organization must be able to deal with message repetitions. This implies 
that agents should discard repeated messages or re-issue any answer already sent; 
also, the coordination policies should contain mechanisms that prevent confusion 
in the cases of partial deliveries or new agents joining the organization between 
two repetitions. 

Similarly, the commanding agent has to deal with repeated answers, possibly 
caused by its own repeated requests. In the worst case, these repeated 
answers may even be different - e.g., because something has changed in the 
environment, a new agent has joined the organization, and so on. Rather than 
introducing middle-agents or making the organizational coordination protocols 
overly complicated to prevent this from happening (which is likely to be 
impossible to achieve anyway, according to [10]), our current choice is to have 
the requester consider whatever answer it receives first as the valid one, and 
ignore all others. 
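
The two rules discussed above, namely that members of the organization 
re-issue an answer already sent when a request is repeated, and that the 
requester keeps only the first answer it receives, can be sketched as follows. 
The classes Responder and Requester, their methods, and the use of a request 
identifier are assumptions made for illustration; transport, timeouts, and 
coordination policies are left out. 

```ruby
# Hypothetical sketch of both sides of the degraded cases.

# Organization side: remember answers by request id, so that a repeated
# request re-issues the stored answer instead of redoing the work.
class Responder
  def initialize
    @answers = {}
  end

  def handle(request_id)
    @answers[request_id] = yield unless @answers.key?(request_id)
    @answers[request_id]
  end
end

# Requester side: the first answer received for a request is the valid one;
# later answers, even if different, are ignored.
class Requester
  def initialize
    @accepted = {}
  end

  def receive(request_id, answer)
    @accepted[request_id] = answer unless @accepted.key?(request_id)
    @accepted[request_id]
  end
end

requester = Requester.new
puts requester.receive('req-1', 'presentation on screen 2')
puts requester.receive('req-1', 'presentation on PDA') # ignored, first wins
```

This tolerates duplicates at the price of possibly keeping an answer that is no 
longer the best one, which fits the non-safety-critical setting assumed here. 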

Of course, the protocol presented above can be used only in non-safety critical 
applications or in situations where some level of uncertainty can be tolerated. 
This is definitely the case in our multi-media environments, where quality of 
communications is fairly high and the objective is nothing more critical than 
providing some guidance and context-sensitive cultural information to visitors 
of museums. 

Observe that third parties overhearing a channel, in accordance with [3], may 
help in making the interaction with implicit organizations much more robust - 
for instance, by detecting some message losses, or whether a goal-specific 
sub-team has been established. Exploring these possibilities is left to future 
work. 

6 Formalization 

This section provides a high-level formalization of the coordination within an 
implicit organization. This is done by means of a logic specifically designed for 
conversation policies, the Joint Intention Theory (JIT) [5,17]. JIT was born as 
a follow-up to a formalization of the theory of speech acts [18]. Recently, it has 
been applied to group communication [12], and a particular form of diagram, 
composed of landmark expressions, has been introduced. This diagram “resembles 
state machines but instead of specifying the state transitions, it specifies a 
partially ordered set of states” [14] called landmarks and corresponding to JIT 
formulas. 

Strictly speaking, JIT is not applicable to our domain because we adopt 
unreliable communication; however, it is very convenient for capturing certain 
aspects of teamwork. We return to this point in Sec. 6.1 below. 

JIT is expressed as a modal language with the connectives of first-order logic 
and operators for propositional attitudes. Greek letters are used for groups, 
and lowercase variables for agents. We use the definitions from the mentioned 
papers [17,12,14], to which we add a few new ones in order to simplify the 
formulas introduced later. We identify an implicit organization with the role it 
plays, and indicate it with $\rho$; $\rho(x)$ is true if $x$ is a member of 
$\rho$, which means that $x$ is playing the role $\rho$ on a given channel. We 
simplify the definition of group mutual belief given in [12] - that is, 
$(\mathrm{MB}\ \tau_1\ \tau_2\ p) \equiv (\mathrm{BMB}\ \tau_1\ \tau_2\ p) \land (\mathrm{BMB}\ \tau_2\ \tau_1\ p)$ 
- for the special case of “all agents of a group towards their group and vice 
versa”: 

\[ (\mathrm{MB}\ \rho\ p) \equiv \forall x\ \rho(x) \supset (\mathrm{BMB}\ x\ \rho\ p) \land (\mathrm{BMB}\ \rho\ x\ p) \]

Analogously, we extend the definitions of mutual goal (MG), joint persistent goal 
(JPG), and introduce group extensions of some others from [18], as follows. For 
simplicity, we consider the relativizing condition $q$ always true, thus it will 
be omitted in all the following formulas. A mutual goal is defined as: 

\[ (\mathrm{MG}\ \rho\ p) \equiv (\mathrm{MB}\ \rho\ (\mathrm{GOAL}\ \rho\ \Diamond p)) \]

A weak achievement goal is: 

\[ (\mathrm{WAG}\ x\ \rho\ p) \equiv [(\mathrm{BEL}\ x\ \neg p) \land (\mathrm{GOAL}\ x\ \Diamond p)] \lor [(\mathrm{BEL}\ x\ p) \land (\mathrm{GOAL}\ x\ \Diamond(\mathrm{MB}\ \rho\ p))] \lor [(\mathrm{BEL}\ x\ \Box\neg p) \land (\mathrm{GOAL}\ x\ \Diamond(\mathrm{MB}\ \rho\ \Box\neg p))] \]

i.e., $x$ has a WAG toward $\rho$ when it believes that $p$ is not currently 
true, in which case it has a goal to achieve $p$, or if it believes $p$ to be 
either true or impossible, in which case it has a goal to bring about the 
corresponding mutual beliefs. A weak mutual goal is: 

\[ (\mathrm{WMG}\ \rho\ p) \equiv \forall x\ \rho(x) \supset (\mathrm{MB}\ \rho\ (\mathrm{WAG}\ x\ \rho\ p) \land (\mathrm{WAG}\ \rho\ x\ p)) \]

where: $(\mathrm{WAG}\ \rho\ y\ p) \equiv \forall x\ \rho(x) \supset (\mathrm{WAG}\ x\ y\ p)$. 
A weak mutual goal is a mutual belief that each agent has a weak achievement 
goal towards its group for achieving $p$, and conversely the group has the goal 
towards its members. A joint persistent goal is: 

\[ (\mathrm{JPG}\ \rho\ p) \equiv (\mathrm{MB}\ \rho\ \neg p) \land (\mathrm{MG}\ \rho\ p) \land (\mathrm{UNTIL}\ [(\mathrm{MB}\ \rho\ p) \lor (\mathrm{MB}\ \rho\ \Box\neg p)]\ (\mathrm{WMG}\ \rho\ p)) \]

that is, $\rho$ has a joint persistent goal $p$ when there is a mutual belief 
that $p$ is not currently true, there is a mutual goal to bring about $p$, and 
$p$ will remain a weak mutual goal until there is a mutual belief that $p$ is 
either true, or will never be true. 

The expression $(\mathrm{DONE}\ \rho\ p)$, used in the following, is a 
group-extension of the definition of DONE given in [4], i.e. any agent in the 
group $\rho$ believes that $p$ happened immediately before the present time. 

Finally, we define $\mathit{Coord}(P_i\ \sigma\ r\ p)$, where $P_i$ is a 
coordination policy, $\sigma$ is a group of agents, $r$ is another agent, and 
$p$ is a goal, as a function that computes the sequence of actions that must be 
performed by $\sigma$ under policy $P_i$ in order to achieve $p$ and notify $r$ 
when done. This sequence is composed of coordination actions and sub-goals 
assigned to individual agents, and must terminate with a DONE or INFORM message 
to $r$ on the results. Recall that, by using a channeled multicast 
infrastructure such as LoudVoice, everybody listening to a channel receives 
everything sent on it; thus, the action of sending the results to $r$ also 
establishes - via overhearing - a belief in the listeners that the goal has been 
reached (or is not reachable). 

With the definitions given above, we can now formalize how an implicit 
organization must behave to achieve a goal $p$ commanded by a request from 
agent $r$ to the role $\rho$ (Section 5). In the following, 
$\mathit{CurrentPolicyDecided}$ represents whether or not a coordination policy 
has already been established, i.e. 
$\mathit{CurrentPolicyDecided} \equiv \exists P_i\ (\mathit{CurrentPolicy}\ P_i)$, 
where $P_i$ is a policy instance (Sec. 4). $\sigma$ represents the goal-specific 
sub-team instantiated to achieve $p$. The behaviour of the implicit 
organization is represented by the landmark diagram of Figure 3, where the 
landmarks correspond to the following JIT expressions: 

L1: $\neg(\mathrm{DONE}\ \rho\ p) \land (\mathrm{JPG}\ \rho\ ((\mathrm{DONE}\ \rho\ p) \land (\mathrm{BEL}\ r\ p)))$

L2: $(\mathrm{MB}\ \rho\ \mathit{CurrentPolicyDecided}) \land (\mathrm{INTEND}\ \sigma\ \mathit{Coord}(P_i\ \sigma\ r\ p))$

L3: $(\mathrm{MB}\ \rho\ ((\mathrm{DONE}\ \rho\ p) \lor \Box\neg(\mathrm{DONE}\ \rho\ p)))$

L4: $(\mathrm{MB}\ \rho\ \neg\mathit{CurrentPolicyDecided}) \land (\mathrm{JPG}\ \rho\ \mathit{CurrentPolicyDecided}) \land (\mathrm{JPG}\ \rho\ ((\mathrm{DONE}\ \rho\ p) \land (\mathrm{BEL}\ r\ p)))$

The machine starts executing (i.e. landmark L1 is entered) when a request for 
$p$ from $r$ arrives at an implicit organization $\rho$, and concludes (i.e., 
L3 is reached) when the request is satisfied. In other words, Fig. 3 is an 
expansion of the state Organization Preparing Answer of the diagram of Fig. 2. 

Landmark L1 says that the requested goal has not been achieved yet, and that 
$\rho$ has a joint persistent goal to achieve it and notify $r$. However, 
achieving $p$ is possible only if the organization knows how to coordinate. If 
this is the case, i.e. a coordination policy has already been established, 
then it moves to landmark L2, otherwise to L4. The transition from L4 to L2 is 
called policy negotiation, and is discussed in Sec. 7 below. 




Fig. 3. Coordination in an implicit organization 

In the transitions from L1 to L2 and from L4 to L2, a sub-team $\sigma$ is 
formed and given an intention to achieve $p$ on behalf of $\rho$. As stated in 
Sec. 4, in this work we assume that $\sigma$ consists of all agents able to 
achieve $p$ at the time the request arrives (i.e., when L1 is entered). In a 
policy-specific way, $\sigma$ does whatever is required to achieve $p$. The 
notification of the results to $r$ is overheard by everybody in $\rho$, thus it 
establishes the mutual belief within the organization that satisfies its 
joint goal (landmark L3). 

6.1 Common Knowledge, JIT, and LoudVoice 

As mentioned in Sec. 3, our communication infrastructure LoudVoice privileges 
speed over reliability. In our settings (typically a standard LAN extended with 
a wireless infrastructure), we can assume practically instantaneous transport 
of messages on a channel with respect to the evolution of the environment, 
and a certain probability that - for whatever reason, including network jams 
and inherent unreliability of wireless links - messages can be occasionally lost 
and temporary network partitions happen (typically when mobile devices move 
through areas not covered by any signal). Apparently, this goes against using 
JIT for formalization, because JIT assumes the possibility of establishing mutual 
beliefs within a team. 

In a simplistic interpretation, a mutual belief is a form of common knowledge
[10], e.g. an agent i believes that an agent j believes b, j believes that i
believes that j believes b, i believes that j believes that i believes that j
believes b, and so on forever. “Perfect” common knowledge cannot be attained with
either unreliable group communication or reliable but non-instantaneous delivery
of messages. Specifically, Halpern and Moses in [10] formalize well-known results
in distributed systems theory, i.e. that synchronized action based on perfect
common knowledge is not possible.

By applying game theory, Morris and Shin reached an important result on
probabilistic coordination [16]. In short, they show that, in non-strategic games
- i.e. where all agents play by rules determined a priori independently of their
individual objectives and preferences - and with a reasonably small probability of
losing messages, it is possible to design protocols which achieve coordination (in
game theory-speak, maximize payoff) with a probability arbitrarily close to the
desired one. Non-strategic behavior, pre-defined rules, and small probability of
loss, are all characteristics of our domain. We designed our protocols to be robust 
against single message losses or the occasional network partitioning, thus giving 
a high level of confidence on their outcomes. Some mechanisms, such as the 
periodic reminders sent on the control channel about the current policy adopted 
by an implicit organization (Sec. 7.3), have the specific purpose of increasing 
that confidence by detecting problems as soon as possible. 

Contrary to the interpretation given above, Kumar et al. argue [13] that a 
mutual belief does not imply common knowledge - indeed, agents may be wrong 
about their beliefs, including their mutual ones. Kumar mentions a number of 
works on the establishment of mutual beliefs by default rules, and introduces a 
set of defeasible rules of communication that cause belief revision when an agent 
realizes that some assumptions on others’ beliefs were wrong. Observe that our
periodic reminders mentioned above serve exactly this purpose, i.e. triggering a 
belief revision when necessary. 

Our conclusion is that we can adopt JIT for a high-level formalization in 
spite of using unreliable communication. However, the protocols described in 
this work were not directly derived from landmark expressions (as done by [14]) 
- this would have left fixing failures to the rules mentioned above. Rather, we 
decided to deal directly with message losses in order to reach our required level 
of confidence. Automatic protocol derivation from JIT in our setting may be the 
objective of future work. 



7 Negotiating a Policy 

This section formalizes the negotiation protocol for the organizational coordi- 
nation policy, and describes an algorithm to be applied with unreliable group 
communication. In summary, the negotiation protocol works in two phases: first, 
the set of policy instances common to all agents is computed; then, one is se- 
lected. 



7.1 Defining Policy Instances 

Recall, from Sec. 4, that a policy instance is a tuple with the name of a policy
and ground values for all its parameters; e.g., <auction, Euro> represents an
“auction” policy whose first parameter (presumably the currency) is “Euro”. A
potential issue with our two-phase protocol is that an agent may support a very
large (even infinite) or undetermined number of policy instances: this happens,
for instance, when a parameter may take any integer value, or has to be set to
the name of one of the agents of the organization. To address this, we have
designed a language for expressing sets of policy instances in a compact way by
means of constraints on parameters [2]. Parameters are currently limited to
integers or strings, but this is enough for our purposes. An expression in this
language describes a (possibly infinite) set of policy instances. Reducing two
expressions means generating zero or more expressions that describe the
intersection of the input sets. The intersection of policies performed during
negotiation (see the algorithm later) is thus the reduction of all expressions
exchanged during the protocol. Simple examples of expressions are:

name = MulticastContractNet
    param = Currency constraint = (one of Dollar, Euro) suggested = (Euro)
    param = WinningCriteria value = lowest
    param = BidTimeout constraint = (in 100..2000) suggested = (1000)

name = MasterSlave
    param = Master constraint = (not one of Agent1, Agent2)
        suggested = (Agent3, Agent4)

name = any
The last expression represents all possible policy instances, and is used by agents
leaving an organization or external to it when triggering a negotiation, as shown 
later. 
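As an illustration only, the reduction of two constraints on a single parameter might be sketched in Python as follows. This is not the language of [2]: the names `IntRange`, `OneOf`, `ANY` and `reduce_param`, and the currency "Yen", are hypothetical and introduced purely for this example, which covers only integer ranges and "one of" string sets.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Union


@dataclass(frozen=True)
class IntRange:
    """Constraint such as (in 100..2000): lo <= value <= hi."""
    lo: int
    hi: int


@dataclass(frozen=True)
class OneOf:
    """Constraint such as (one of Dollar, Euro)."""
    values: FrozenSet[str]


ANY = "any"  # hypothetical marker for "no constraint at all"
Constraint = Union[IntRange, OneOf, str]


def reduce_param(a: Constraint, b: Constraint) -> Optional[Constraint]:
    """Return a constraint describing the intersection, or None if it is empty."""
    if a == ANY:
        return b
    if b == ANY:
        return a
    if isinstance(a, IntRange) and isinstance(b, IntRange):
        lo, hi = max(a.lo, b.lo), min(a.hi, b.hi)
        return IntRange(lo, hi) if lo <= hi else None
    if isinstance(a, OneOf) and isinstance(b, OneOf):
        common = a.values & b.values
        return OneOf(common) if common else None
    return None  # constraints of different kinds never match in this sketch


timeout = reduce_param(IntRange(100, 2000), IntRange(500, 5000))
currency = reduce_param(OneOf(frozenset({"Dollar", "Euro"})),
                        OneOf(frozenset({"Euro", "Yen"})))
```

Under these assumptions, `timeout` describes the range 500..2000 and `currency` allows only Euro; an empty intersection is reported as `None`, which corresponds to reduction generating zero expressions.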



7.2 Formalization 

From the perspective of an individual member x, the coordination policy of an
organization, independently of its specific value, goes through three different
states: “unknown”, “negotiating”, and “decided”. The policy is unknown when
the agent joins the organization, i.e. when it tunes into a channel to play the
organization’s role; the first goal of x is to negotiate it with the other members.
At that point, the following applies:

(∄ Pi (BEL x (CurrentPolicy Pi))) ∧

(INTEND x SEND(ρ, REQUEST(CurrentPolicy ?)))

that is, the agent does not hold any belief on the current policy, and asks
the other members about it. This request is interpreted by the organization as a
notification that a new member has joined. In turn, this triggers the negotiation
protocol, modeled by the landmark diagram of Fig. 4, whose landmarks are the
following:

L1: (JPG ρ CurrentPolicyDecided)

L2: (JPG ρ CurrentPolicyDecided) ∧ (MB ρ (CommonPolicies Π))
    ∧ (GOAL o (MB ρ (CurrentPolicy selectFrom(o, Π))))

L3: (MB ρ (CommonPolicies Π)) ∧ (MB ρ (CurrentPolicy Pi))

L1 represents the state of the implicit organization when negotiation is started.
There is a mutual belief that no policy has been currently decided within the
organization. This happens, for instance, when a new agent joins - the REQUEST
it sends makes the organization aware that the mutual belief on the current
policy, if it had ever held before, no longer holds and must be re-established.
Thus, ρ sets for itself a joint persistent goal of agreeing on the current policy.

As mentioned earlier, this is done in two steps. In the first, corresponding
to the transition from L1 to L2, a mutual belief about the policy instances
common to the entire group is established. This can be easily achieved by
exchanging reciprocal INFORMs on what each agent is able to support, and
intersecting their contents; the resulting set of policy instances is the mutual
belief (CommonPolicies Π).
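A minimal Python sketch of this first step, assuming every INFORM has been received and that a policy instance is modelled as a tuple of strings (both the representation and the name `common_policies` are assumptions made for illustration):

```python
from typing import Dict, FrozenSet, Tuple

PolicyInstance = Tuple[str, ...]  # e.g. ("auction", "Euro")


def common_policies(
    supported: Dict[str, FrozenSet[PolicyInstance]]
) -> FrozenSet[PolicyInstance]:
    """Intersect the policy instances announced by each agent."""
    sets = list(supported.values())
    if not sets:
        return frozenset()
    return frozenset.intersection(*sets)


supported = {
    "Agent1": frozenset({("auction", "Euro"), ("auction", "Dollar")}),
    "Agent2": frozenset({("auction", "Euro"), ("MasterSlave", "Agent2")}),
}
common = common_policies(supported)
```

Here `common` contains only `("auction", "Euro")`, the single instance both agents support.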

At this stage (landmark L2), a special agent o, called the oracle, assumes the
goal of establishing the CurrentPolicy of the role; the function selectFrom(o, Π)
is evaluated by o and returns a member of its input set. Thus, the second step
of the protocol (the transition from L2 to L3) consists of the decision-making by
the oracle and the establishment of the current policy, which can be done by o
sending an ASSERT to ρ about the value returned by selectFrom(o, Π). This
assertion, in turn, causes the agents to know that CurrentPolicyDecided is now
true, and thus the related joint goal is automatically satisfied.

Fig. 4. Policy negotiation protocol

The oracle can be anybody, a member of the organization as well as an 
external agent (the latter can overhear the negotiation protocol, thus it knows 
the common policies). It can apply whatever decision criteria it deems most
appropriate - from a random choice, to a configuration rule, to inference from
previous policies based on machine learning, and so on. In this work, we do not
elaborate on how the oracle is chosen, nor on its logic (but examples are given in 
Sec. 9). The algorithm presented below: (1) allows for any external agent (such 
as a network monitor or an application agent interested in enforcing certain 
policies) to intervene just after the common policies have been established; (2) 
provides a default oracle election mechanism if an external one is not present; 
and, (3) handles conflicting oracles by forcing a re-negotiation. 

7.3 From Theory to Practice 

As discussed in Sec. 6.1, the mutual beliefs assumed by JIT cannot be guaranteed
with unreliable communication. A practical protocol for the decision of a policy
has to take this issue into account, as well as the problem of a highly dynamic
environment where agents can join and leave very quickly. To this end, the
protocol implemented by the algorithm described below adds one piece of
information and a few more messages to the formal model given above.

All messages concerning policy negotiation are marked with a Negotiation
Sequence Number (NSN). An NSN is the identifier of a negotiation process, and
is unique during the lifetime of an organization. NSNs form an ordered set; an
increment(nsn) function returns an NSN that is strictly greater than its input
nsn. In our current implementation, an NSN is simply an integer; the first agent
joining an organization sets the NSN of the first negotiation to zero.

The goal of the NSN is to help guarantee the coherence of protocols and the
integrity of mutual beliefs in the cases of message losses, network partitioning,
and new agents joining midway through a negotiation. As described in the
algorithm below, messages related to policy negotiation are interpreted by an
agent depending on their NSN. Messages containing an obsolete NSN are simply
discarded, possibly after informing the sender that it is out of date. Messages
that have an NSN newer than the one known to the agent are also ignored but cause
the agent to enter a negotiating state. Only messages whose NSN is equal to the
known one are actually handled.
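The three cases can be summarized by a small Python sketch, assuming integer NSNs as in the current implementation; the `Action` enumeration and the `classify` function are hypothetical names used only for illustration:

```python
from enum import Enum


class Action(Enum):
    DISCARD = "discard"        # obsolete NSN: drop, possibly notifying the sender
    RENEGOTIATE = "negotiate"  # newer NSN: enter the negotiating state
    HANDLE = "handle"          # same NSN: process the message normally


def classify(message_nsn: int, known_nsn: int) -> Action:
    """Decide how to treat a negotiation message from its NSN alone."""
    if message_nsn < known_nsn:
        return Action.DISCARD
    if message_nsn > known_nsn:
        return Action.RENEGOTIATE
    return Action.HANDLE
```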

The protocol is made robust in two other ways. First, during the transition
from L1 to L2, rather than exchanging simple INFORMs on the policies they
support, agents send repeated INFORMs on the known common policies and
participating agents. Second, a policy reminder message, consisting of an
INFORM on what is believed to be the current policy, is periodically sent by each
agent after the end of the negotiation, allowing recovery from the loss of the
oracle announcement and consistency checking against other problems.
Note that a network partitioning may cause two sub-organizations, each living
on its own partition, to increment their NSN independently; when the partitions
are rejoined, the integrity checks on the NSNs described above cause the
organization to re-negotiate the policy.

A member of an organization can be modeled as having three main beliefs
directly related to policies: the supported policies, the common policies, and the
current policy. As its name suggests, the first belief consists of all the policy
instances that an agent supports: SupportedPolicies(ρ, {P1 ... Pk}), where ρ is
the role and {Pi} is a set of policy instances, typically defined by the agent
developer. The common policies belief contains a set of agents participating in a
negotiation (identified by its NSN) and the policies supported by everybody, i.e.
the intersection of their supported policies: CommonPolicies(ρ, NSN, {P1 ... Pk},
{A1 ... An}), where ρ is the role and {Pi} is the intersection of all the policy
instances supported by agents {A1 ... An}. Finally, the current policy is the
policy instance decided by the oracle for a given negotiation (identified by its
NSN): CurrentPolicy(ρ, NSN, Pk).

7.4 The Algorithm 

This section describes, as pseudo-code, the negotiation algorithm performed by 
each agent of an implicit organization. 

In the following, we assume that an agent is a simple event-driven rule engine.
We adopt some obvious syntactical conventions, a mixture of generic event-action
rules, C (for code structuring, == for testing equality and != for inequality) and
Pascal (variable declarations as name: type). The suspend primitive blocks the
execution of the calling rule until its input condition (typically a timeout) is
satisfied; during the time of suspension, other rules can be invoked as events
unfold. A few primitives are used to send messages (INFORM, ASSERT, REQUEST) to
the role for which the policy is being negotiated; for simplicity, we assume that
the beliefs being transmitted or queried are expressed in a Prolog-like language.
For readability, the role is never explicitly mentioned in the following, since the
algorithm works for a single role at a time.

The three main beliefs described above and the state of the negotiation are 
represented by the following variables: 

myself: Agent_Identifier;
myNSN: Negotiation_Sequence_Number;
negotiationState: {UNKNOWN, NEGOTIATING, DECIDED};
supportedPolicies: set of Policy;
commonPolicies: set of Policy;
negotiatingAgents: set of Agent_Identifier;
currentPolicy: Policy;

When the agent starts playing a role, it needs to discover the current situation of 
the organization, in particular its current NSN. This is done by sending a query 
to the role about the current policy. If nothing happens, after a while the agent
assumes it is alone, and starts a negotiation - message losses are recovered by
the rest of the algorithm.
on_Start() {
  set negotiationState = UNKNOWN;
  set myNSN = MIN_VALUE;
  set myself = getOwnAgentIdentifier ();

  /// ask the role for the current policy and its NSN
  REQUEST (CURRENT_POLICY (?, ?));
  suspend until timeout;

  /// no answer received: assume to be alone and start a negotiation
  if ((currentPolicy == nil) AND (negotiationState == UNKNOWN))
    negotiate (increment (myNSN), supportedPolicies, set_of (myself));
}

on_Request ( CURRENT_POLICY (input_NSN, input_policy) ) {
  /// answer an open query only when a policy is known
  if ((input_NSN == ?) AND (input_policy == ?) AND (currentPolicy != nil))
    INFORM (CURRENT_POLICY (myNSN, currentPolicy));
}

The negotiation process mainly consists of an iterative intersection of the policies
supported by all agents, which any agent can start by sending an INFORM with
its own supported policies and an NSN higher than the one of the last negotiation
(see negotiate() later on). Conversely, if the agent receives an INFORM on
the common policies whose NSN is greater than the one known to the agent, it
infers that a new negotiation has started, and joins it.

The contents of the INFORMs on the common policies that are received
during a negotiation are intersected as shown in the following. If the resulting
common policies set is empty, i.e. no policy can be agreed upon, the agent notifies
a failure condition, waits for some time to allow network re-configurations or
agents to leave, and then restarts the negotiation.

on_Inform ( COMMON_POLICIES (input_NSN, input_policies, input_agents) ) {
  if (input_NSN > myNSN)
    negotiate (input_NSN,
               intersect (supportedPolicies, input_policies),
               union (input_agents, set_of (myself)));
  else
    if ((input_NSN == myNSN) AND (negotiationState == NEGOTIATING)) {
      commonPolicies = intersect (commonPolicies, input_policies);
      negotiatingAgents = union (negotiatingAgents, input_agents);
      if (commonPolicies == {empty set}) {
        {notify failure to user};
        suspend until timeout;
        negotiate (increment (myNSN), supportedPolicies, set_of (myself));
      }
    }
    else
      if (negotiationState == DECIDED)
        INFORM (CURRENT_POLICY (myNSN, currentPolicy));
      else
        if (negotiationState == NEGOTIATING)
          INFORM (COMMON_POLICIES (myNSN, commonPolicies, negotiatingAgents));
}
A negotiation is either started by the agent, which sets the initial value of the
common policies to those it supports, or joined when receiving information on the
common policies (see the code above). As described earlier, during the first phase
of a negotiation the agent informs the channel about the policies it knows as
common, then waits for a period, during which it collects INFORMs from the other
members of the organization. This process is repeated for (at most) max_repeats
times, to allow the synchronization of the mutual beliefs on the common policies
and the recovery of any lost message. It follows that max_repeats is a sensitive
parameter, whose value mainly depends on the reliability of the transport
media in use for the channels: if this is very reliable, then max_repeats can be
low (two or three). The set of negotiating agents, which is not exploited by this
algorithm, may be used by a more sophisticated algorithm for a better recovery
(e.g., by comparing this set with the set of those agents from which messages
have been actually received).

After max_repeats repetitions, the set of common policies should correspond 
to the intersection of all policies accepted by all negotiating agents. It is now 
time to choose a policy from this set. As mentioned in the previous section, 
this is done either by an external oracle, or - if none is present - by an agent 
chosen with an arbitrary heuristic. Below, we choose the agent with the lowest 
identifier (after checking the NSN, to prevent confusion when negotiate () is 
called recursively by a new negotiation starting in the middle of another). The 
self-nominated oracle can apply whatever criteria it prefers to pick a policy; here, 
we use a simple random choice. 
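The default election and the random selection can be sketched in Python as follows. This is only an illustration: it assumes that agent identifiers are strings compared lexicographically and that policies are identified by name, and the function names `is_default_oracle` and `pick_policy` are hypothetical.

```python
import random
from typing import Iterable, Optional, Set


def is_default_oracle(myself: str, negotiating_agents: Iterable[str]) -> bool:
    """True if this agent has the lowest identifier among the negotiators."""
    return myself == min(negotiating_agents)


def pick_policy(common_policies: Set[str], rng: random.Random) -> Optional[str]:
    """Random choice among the common policies; None if there are none."""
    if not common_policies:
        return None
    # Sorting first makes the choice reproducible for a given seed.
    return rng.choice(sorted(common_policies))


agents = {"Agent2", "Agent1", "Agent3"}
chosen = pick_policy({"PlainCompetition", "MasterSlave"}, random.Random(0))
```

With these agents only `Agent1` nominates itself; every other agent simply waits for the ASSERT of the current policy.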

procedure negotiate (negotiation_NSN : NSN,
                     initial_policies : set of Policy,
                     initial_agents : set of Agent_Identifier) {
  set negotiationState = NEGOTIATING;
  set myNSN = negotiation_NSN;
  set commonPolicies = initial_policies;
  set negotiatingAgents = initial_agents;
  currentPolicy = nil;

  INFORM (COMMON_POLICIES (myNSN, commonPolicies, negotiatingAgents));
  repeat max_repeats times {
    suspend until timeout;
    if (currentPolicy != nil)
      break; /// out of the 'repeat' block
    INFORM (COMMON_POLICIES (myNSN, commonPolicies, negotiatingAgents));
  }

  suspend until (timeout OR currentPolicy != nil);
  if ((currentPolicy == nil) AND (myNSN == negotiation_NSN) AND
      (LowestId (negotiatingAgents) == myself)) {
    set currentPolicy = random_choice (commonPolicies);
    ASSERT (CURRENT_POLICY (myNSN, currentPolicy));
  }

  while (currentPolicy == nil)
    suspend until (timeout OR currentPolicy != nil);
  set negotiationState = DECIDED;
}
When an ASSERT of the current policy is received, the agent does a few checks
to detect inconsistencies - for instance, that the NSN does not refer to a different 
negotiation, or that two oracles have not attempted to set the policy indepen- 
dently. If everything is fine, the assertion is accepted, causing negotiate () to 
finish (see the code above). Otherwise, the assertion is either refused, or triggers 
a new negotiation. 

on_Assert ( CURRENT_POLICY (input_NSN, input_policy) ) {
  if (input_NSN == myNSN) {
    if ( ((currentPolicy == nil) AND
          commonPolicies.contains (input_policy)) OR
         (currentPolicy == input_policy) )
      set currentPolicy = input_policy;
    else
      negotiate (increment (myNSN), supportedPolicies, set_of (myself));
  }
  else
    if (myNSN < input_NSN)
      negotiate (increment (input_NSN), supportedPolicies, set_of (myself));
}

Since the assertion of the current policy can be lost, and to prevent
inconsistencies caused for instance by network partitioning and rejoining, each
agent periodically reminds the group of what it believes to be the current policy.
The frequency of the reminders may change depending on the reliability of the
transport media. When an agent receives a reminder, it checks for consistency with
what it believes - i.e., that the NSN and the policy are the same. If not, depending
on the situation it may either react with an INFORM (to let the sender know of a
likely problem and possibly re-negotiate) or trigger a negotiation itself.

on_PolicyReminder_Timeout () {
  INFORM (CURRENT_POLICY (myNSN, currentPolicy));
  set policy_reminder_timeout = getPolicyReminder_Timeout ();
}

on_Inform ( CURRENT_POLICY (input_NSN, input_policy) ) {
  if (input_NSN > myNSN)
    negotiate (increment (input_NSN), supportedPolicies, set_of (myself));
  else
    if (input_NSN < myNSN) {
      if (negotiationState == DECIDED)
        INFORM (CURRENT_POLICY (myNSN, currentPolicy));
    }
    else /** that is, input_NSN == myNSN **/
      if ((currentPolicy == nil) AND commonPolicies.contains (input_policy))
        set currentPolicy = input_policy;
      else
        if (input_policy != currentPolicy)
          negotiate (increment (myNSN), supportedPolicies, set_of (myself));
}
Finally, when an agent leaves the channel, it has a social obligation to start a
new negotiation to allow the others to adapt to the changed situation. This is 
done by triggering the negotiation process with an INFORM about the common 
policies, with an incremented NSN and the policies set to a special value any 
which means “anything is acceptable” (Sec. 7.1). 

on_Leave() {
  INFORM (COMMON_POLICIES (increment (myNSN), set_of ("any"), nil));
}
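One way to read the special value is as the neutral element of the intersection performed by the remaining agents. The following Python sketch illustrates that reading only; it assumes policies are identified by name, and the `intersect` function shown here is a hypothetical stand-in for the one used in the pseudo-code.

```python
from typing import FrozenSet

ANY = "any"  # special value meaning "anything is acceptable"


def intersect(mine: FrozenSet[str], received: FrozenSet[str]) -> FrozenSet[str]:
    """Set intersection in which a set containing ANY is the neutral element."""
    if ANY in received:
        return mine
    if ANY in mine:
        return received
    return mine & received


remaining = intersect(frozenset({"MasterSlave", "PlainCompetition"}),
                      frozenset({ANY}))
```

The leaving agent therefore places no restriction on the outcome: `remaining` is exactly what the receiving agent already supported.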

8 Coordination Protocols 

Recall, from Sec. 4, that when an implicit organization receives a request (i.e.,
the state Organization Preparing Answer of Fig. 2 is entered), its members with
available resources form a sub-team. This happens silently, i.e. there is no
explicit group formation message. As described in Sec. 7, if an organizational
coordination policy has not been established yet, then one is negotiated among all
members of the organization (these are the transitions from L1 to L4 and then
to L2 in Fig. 3). Once the policy has been decided, the goal-specific sub-team can
start working.

All policies have to follow a straightforward, abstract three-phase schema, 
summarized in Fig. 5. Before doing anything, the sub-team coordinates to form 
a joint intention on how to achieve the goal (Pre-Work Coordination). The 
agents in charge perform whatever action is required, including any necessary 
on-going coordination (Working). When finally everybody finishes, a Post-Work 
Coordination phase collects results and replies to the requester (which corre- 
sponds to the Answer Issued event of Fig. 2). 

In the following we describe four basic organizational coordination policies
that we use in our domain; examples of their application are in the next section.
Many other variants and alternatives can be designed to meet different require- 
ments, such as Quality of Service objectives, efficiency, and so on. 

The Plain Competition policy is nothing more than “everybody races
against everybody else, and the first to finish wins”. It is by far the easiest
policy of all: neither pre-work nor post-work coordination is required, while the
on-going coordination consists in overhearing the reply sent by whoever finishes
first. In summary, Plain Competition works as follows: when a role receives a
request, any agent able to react starts working on the job immediately. When an
agent finishes, it sends back its results (as an INFORM or DONE to the requester,
in accordance with Fig. 1). The other agents overhear this answer and stop working
on the same job.



Fig. 5. Abstract coordination policy, UML state diagram (states: Pre-Work
Coordination, Working, Post-Work Coordination)








A race condition is possible: two or more agents finish at the same time 
and send their answers. This is not a problem, as explained in Sec. 5, since the 
requester accepts whatever answer comes first and ignores the others. 

The Simple Collaboration policy consists of a collaboration among all 
participants for synthesizing a common answer from the results obtained by each 
agent independently. This policy does not require any pre-work coordination; the 
on-going coordination consists in declaring what results have been achieved, or 
that work is still on-going; finally, the post-work coordination consists in having 
one agent (the first to finish) collecting answers from everybody else and sending 
a synthesis to the requester. 

Simple Collaboration works as follows. As in Plain Competition, all agents
able to react start working on the job as soon as the request is received. The first
agent that finishes advertises its results with an INFORM to the role, and moves
to the post-work phase. Within a short timeout (a parameter negotiated with 
the policy, usually around half a second), all other members of the subteam must 
react by sending, to the role again, either an INFORM with their own results, 
or an INFORM that says that they are still working followed by an INFORM 
with the results when finally done. The first agent collects all these messages 
and synthesizes the common result, in a goal-dependent way. Let us stress that, 
to support this policy, an agent must have the capabilities both to achieve the 
requested goal, and to synthesize the results. 

As in the previous case, a race condition is possible among agents finishing at 
the same time, so more than one may be collecting the same results; as always, 
multiple answers are not a problem for the requester, and generally imply a 
minor waste of computational resources because of the synthesis by multiple 
agents. 
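The post-work phase of the collecting agent can be sketched in Python as follows. The synthesis is goal-dependent; this sketch assumes, purely for illustration, that it is a plain concatenation of result lists, and the name `synthesize` and the data layout are hypothetical.

```python
from typing import Dict, List, Optional


def synthesize(own: List[str],
               replies: Dict[str, Optional[List[str]]]) -> Optional[List[str]]:
    """Merge results once every team member has reported them.

    `replies` maps an agent identifier to its results, or to None when the
    agent has only sent a "still working" INFORM so far.
    """
    if any(results is None for results in replies.values()):
        return None  # keep waiting for the missing results
    merged = list(own)
    for agent in sorted(replies):
        merged.extend(replies[agent])
    return merged


pending = synthesize(["doc1"], {"Agent2": ["doc2"], "Agent3": None})
answer = synthesize(["doc1"], {"Agent3": ["doc3"], "Agent2": ["doc2"]})
```

In this example `pending` is `None` because Agent3 is still working, while `answer` is the merged list sent to the requester.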

The Multicast Contract Net policy is a simplification of the well-known
Contract Net protocol [19], where the Manager is the agent sending a request to
a role, and the award is determined by the bidders themselves, since everybody
knows everybody else’s bid. Thus, effectively this policy contemplates
coordination only in the pre-work phase, while neither on-going nor post-work
coordination is required. This policy has three parameters: the winning criterion
(lowest or highest bid), a currency for the bid (which can be any string), and a
timeout within which bids must be sent.

Multicast Contract Net works as follows. As soon as a request arrives at the
role, all participating agents send their bid to the role. Since everybody receives
everybody else’s offer, each agent can easily compute which one is the winner. At 
the expiration of the timeout for the bid, the winning agent declares its victory 
to the role with an INFORM repeating its successful bid, and starts working. 

Some degraded cases must be handled. The first happens when two or more
agents send the same winning bid; to solve this issue, as in the policy negotiation
protocol, we arbitrarily chose a heuristic, which consists in taking as winner
the agent with the lowest communication identifier. The second case happens
because of a race condition when the timeout for the bid expires, or because of
the loss of messages; as a consequence, it may happen that two or more agents
believe they are winners. This is solved by an additional, very short wait after
declaring victory, during which each agent believing it is the winner listens for
contradictory declarations from others.
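Since every agent sees the same bids, each one can compute the winner locally. A minimal Python sketch, assuming numeric bids and string identifiers compared lexicographically (the function name `winner` is hypothetical):

```python
from typing import Dict


def winner(bids: Dict[str, float], criteria: str = "lowest") -> str:
    """Return the winning agent; ties go to the lowest identifier."""
    if criteria not in ("lowest", "highest"):
        raise ValueError("criteria must be 'lowest' or 'highest'")
    sign = 1 if criteria == "lowest" else -1
    # Sort key: best bid first, then the identifier as tie-breaker.
    return min(bids, key=lambda agent: (sign * bids[agent], agent))


bids = {"Agent3": 10.0, "Agent1": 10.0, "Agent2": 12.0}
low = winner(bids)              # Agent1 and Agent3 tie; Agent1 wins
high = winner(bids, "highest")  # Agent2 has the highest bid
```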

In spite of these precautions, it may happen that two or more agents believe
they are the winners and attempt to achieve the goal independently. The winner
declaration mechanism, however, reduces the probability of this to the square of
the probability of losing a single message, since at least two consecutive messages
(the bid from the real winner and its victory declaration) must be lost by the
others.
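As a rough numerical illustration, and assuming that message losses are independent (an assumption of this sketch, not a claim about any particular network), the estimate can be computed as follows; the function name is hypothetical.

```python
def double_winner_probability(single_loss: float) -> float:
    """Both the winning bid and the victory declaration must be lost."""
    return single_loss ** 2


# With 1% of messages lost, the estimate is about one case in ten thousand.
estimate = double_winner_probability(0.01)
```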

The Master-Slave policy has many similarities with the Multicast Contract 
Net; the essential difference is that, in place of a bidding phase, a master decides 
which agent is delegated to achieve a goal. The master is elected by the policy 
negotiation protocol. Typically, agents that support this policy either propose 
themselves as masters, or accept any other agent but refuse to be master them- 
selves; this is because the master must react to all requests in order to select a 
slave delegated to achieve a goal, and must have an appropriate selection logic. 

Master-Slave works as follows. When a request arrives, all agents willing
to work send an INFORM to the role to declare their availability. The master
(which may or may not be among those available to work) collects all declarations,
and issues an INFORM to the role nominating the slave, which acknowledges by
repeating the same INFORM. Message loss is recovered by the master handling
a simple timeout between its nomination and the reply, and repeating the
INFORM if necessary. Thus, it is not possible for two agents to believe they are
slaves for the same request.
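The master's recovery loop can be sketched in Python as follows. The channel is abstracted as a callback that sends the nomination and reports whether the acknowledgment arrived before the timeout; the callback, the name `nominate` and the bound `max_attempts` are assumptions of this sketch and are not part of the policy as described.

```python
from typing import Callable, Optional


def nominate(slave: str,
             send_and_wait_ack: Callable[[str], bool],
             max_attempts: int = 3) -> Optional[int]:
    """Repeat the nomination INFORM until the slave echoes it back.

    Returns the number of attempts used, or None if no acknowledgment
    arrived within `max_attempts` (a bound assumed for this sketch).
    """
    for attempt in range(1, max_attempts + 1):
        if send_and_wait_ack(slave):
            return attempt
    return None


# A perfectly reliable channel acknowledges the first nomination.
attempts = nominate("Agent3", lambda slave: True)
```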



9 Practical Examples 



We elaborate on three examples, which have been chosen to show some prac- 
tical implicit organizations and the usage of the policies discussed in Sec. 8. 
Interactions will be illustrated with a simplified, FIPA-like syntax. 

Our first example is of collaboration among search engines. A CitationFinder
accepts requests to look for a text in its knowledge base and returns extracts as
XML documents. For the sake of illustration, we model searching as an action
(e.g., as in scanning a database) rather than a query on the internal beliefs of
the agent. An example of interaction is:



REQUEST
  From: UserAssistant033
  To: CitationFinder
  Content:
    find (Michelangelo)

DONE
  To: UserAssistant033
  Content: done ( find (Michelangelo),
    results (<doc1>born in Italy</doc1>,
             <doc2> ... </doc2>, <doc3> ... </doc3>) )



Typically, different CitationFinders work on different databases. Any of the
coordination policies presented above seems to be acceptable for this simple
role. Particularly interesting is Simple Collaboration, where an agent, when done
with searching, agrees to be the merger of the results; indeed, in this case,
synthesizing is just concatenating all results by all agents. Consider, for instance,
the situation where CitationFinders are on board PDAs or notebooks. A user
entering a smart office causes her agent to tune into the local channel for its role;
consequently, in a typical peer-to-peer fashion, a new user adds her knowledge
base to those of the others in the same room. This could be easily exploited to
develop a collaborative work (CSCW) system. In this case, collaboration may be
enforced by the CSCW agent by acting as the oracle during policy negotiation.

The second example is of competition among presentation planners. The
interactive museum we are developing needs PresentationPlanners (PP for
short) for generating multi-media presentations on specific topics relevant to the
context where a user, or a group of users, is. For instance, a user getting close
to an art object should receive - depending on her interests, profile, previous
presentations she received, etc. - a personalized presentation of the object itself,
of its author, and possibly of related objects in the same environment or other
rooms. A typical interaction looks like the following:



REQUEST 

From: UserAssistant033 
To: PresentationPlanner 
Content : 

buildPresentationForUser (621) 



DONE 

To: UserAssistant033 
Content : done ( 

buildPresentationForUser (621) , 
results ( 

file (http://srv/pres1377.ram) ,
bestResolution ( 800, 600 ) , 
includeVideo ( true ) , 
includeAudio ( true ) ) ) 



Typically, a PP works on a knowledge base containing text, audio and video 
tracks. When a request arrives, the PP collects data on the user and her contexts,
e.g. by querying roles such as UserProfiler and RoomLayout. Then, it queries its own
knowledge base and, if information is available, it builds a multimedia presenta- 
tion, by connecting audio and video tracks, generating audio via text-to-speech 
systems, and so on. 

A PP is often the leader of its own team, formed by highly specialized agents. 
By contrast, it is unlikely that different PPs collaborate - sensibly merging 
multi-media presentations is a hard task even for a human. Observe that, in 
realistic situations, redundancy of PPs is a necessity, e.g. to handle the workload 
imposed by hundreds of simultaneous visitors. Redundancy can be obtained
in various ways, for instance by putting identical PPs working on the same 
knowledge bases, or by specializing them by objects, or by rooms, or by user 
profiles. 

Given the variety of possible configuration choices, the best policies for a 
PP are Plain Competition and Multicast Contract Net based on some quality 
parameter; it may also be that, in well controlled situations, a Master can be 
elected (or imposed by an oracle). Thus, the developer of a PP should enable a 
number of non-collaborative policies, which means specifying criteria for bidding 
in a Multicast Contract Net, accepting a Master-Slave policy but preventing its 
own PP from becoming a master, and so on. 

Our third and last example is about choosing among multiple screens. Smart
rooms may contain multiple places in which to show things to users, e.g. large
screens on different walls, the users’ own PDAs, computers scattered in the 
room. Location, but also quality and logical congruence with the tasks being 
performed by the user, are all important factors. Also, it is not necessarily the 
case that a single screen is a good choice - for example, a presentation to a group 
of people may be better shown in multiple locations simultaneously. 

For our interactive museum, we are working on SmartBrowsers (SB for
short). An SB is an agent able to show a multi-media presentation (video and
audio) and aware of its position (which may be static, if its display is a wall
screen, or mobile, if on board a PDA). A typical interaction looks like the
following:



REQUEST 

From: UserAssistant033 
To: SmartBrowser 
Content : 

showMultiMedia ( 
user (621), 

file (http://server/pres1377.ram) ,
bestResolution ( 800, 600 ), 
includeVideo ( true ) , 
includeAudio ( true ) 

) 



DONE 

To: UserAssistant033 
Content : 
done ( 

showMultiMedia () , 
results ( completed ) ) 



SBs should accept a policy that allows a clear selection of one, or (better) 
a fixed number of agents at request time. Thus, Plain Competition and Simple 
Collaboration are not suitable; Master-Slave works, but seems unduly restrictive 
in a situation where SBs are context aware. We are working on a Multicast 
Contract Net policy where bids are a function of screen resolution, distance from 
user, impact on other people in the same room (e.g. when audio is involved). Only 
SBs visible to the user from her current position, having all required capabilities, 
and not busy showing something else, can participate in the sub-team bidding
for a multi-media presentation.
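
To make the bidding idea more concrete, the following sketch shows one possible shape for such a bid. It is not the policy under development: the `SmartBrowser` fields, the `bid` and `select` helpers and, above all, the weights 0.1 and 0.2 are assumptions chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SmartBrowser:
    name: str
    resolution: Tuple[int, int]      # pixels (width, height)
    position: Tuple[float, float]    # metres, in room coordinates
    visible_to_user: bool
    busy: bool
    has_audio: bool = True
    has_video: bool = True


def bid(sb: SmartBrowser, user_pos: Tuple[float, float],
        wanted: Tuple[int, int] = (800, 600), audio: bool = True,
        video: bool = True, bystanders: int = 0) -> Optional[float]:
    """Return a bid (higher is better), or None if the SB may not take part."""
    if not sb.visible_to_user or sb.busy:
        return None
    if (audio and not sb.has_audio) or (video and not sb.has_video):
        return None
    # Fraction of the requested pixels the screen can offer, capped at 1.
    pixels = sb.resolution[0] * sb.resolution[1]
    quality = min(1.0, pixels / (wanted[0] * wanted[1]))
    distance = math.dist(sb.position, user_pos)
    # Audio disturbs other people in the room; silent presentations do not.
    disturbance = bystanders if audio else 0
    return quality - 0.1 * distance - 0.2 * disturbance   # illustrative weights


def select(bids: Dict[str, Optional[float]], k: int) -> List[str]:
    """Pick the k best bidders, breaking ties by name."""
    valid = [(score, name) for name, score in bids.items() if score is not None]
    valid.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in valid[:k]]
```

Returning `None` rather than a very low score keeps ineligible SBs out of the sub-team altogether, which matches the requirement that only visible, capable and idle SBs bid.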



10 Related Works 

Team programming based on joint intentions has been explored by various au- 
thors, in addition to those already mentioned in Sections 4 and 6. For instance, 
STEAM [21] combines Joint Intentions Theory and SharedPlans [9], and adds
decision-theoretic communication selectivity in order to adapt the type and amount
of communication to the context, by taking into account its costs and its benefits
with respect to the team’s goals.

Dignum et al. [7] deal with team formation via communication, something 
that is clearly related to our work. Our primary objective is not team forma- 
tion, since this is solved by the very definitions of implicit organization and 
goal-specific sub-team - however, as mentioned in Sec. 4, the way the latter is 
formed may become more sophisticated in future. Moreover, with respect to the 
decomposition in steps required to form a team proposed by the model in [7], we
can draw a parallel between their “potential recognition” step, where an agent 
takes the initiative to find a team, and the situations where a new agent joins 
an organization, or takes the initiative of renegotiating the coordination policy 
(and consequently how tasks are allocated among agents) because, for instance, 
of the poor performance of the current policy. 

11 Conclusions and Future Directions 

We proposed implicit organizations for the coordination of agents able to play the 
same role, possibly in different ways, exploiting group communication and over- 
hearing in environments where messages may be occasionally lost and agents can 
join and leave very frequently. We presented a protocol for negotiating a common 
coordination policy, outlined a general organizational coordination protocol, and 
discussed a few examples. 

Current work is focusing on practical experimentation and application to 
our domain, i.e. multi-media, multi-modal cultural information delivery in smart 
rooms (“active museums”). Preliminary performance results on the negotiation 
algorithm show that, on a real, busy LAN, 20 agents can negotiate their own 
common policy in less than a second, without any external oracle intervention. 

Longer term research objectives include more investigation on overhearing. 
We envision the creation of overhearing agents, helping in achieving robustness 
by catching and recovering partial message losses, supervising the behavior of 
implicit organizations, and applying machine learning techniques for deciding 
the “best” policy for a role in a given environment. 

Abstract. We present in this paper an extension to our overhearing-based
group formation framework OTTO (Organizing Teams Through Overhearing).
This framework allows a team to dynamically re-organize itself in accordance
with the evolution of the communication possibilities between agents. In OTTO,
each agent overhears some of the messages exchanged within the team and uses
them to incrementally update a map of the organization of the team. One of the
key points of OTTO is that only few resources are used: a small data structure
is used to track each team member and there is no need to store the overheard
messages. With LOTTO (OTTO for Large numbers of agents) we address the
problem of large teams in which using OTTO “as is” would be costly in terms of
memory and therefore contrary to OTTO’s philosophy of low resource usage.
Therefore, we have implemented a strategy that allows an agent with a limited
memory to track only a part of the team. We have run a series of experiments
in order to evaluate the impact of this limitation and present some results.



1 Introduction 

In this paper, we present an extension to an overhearing-based group formation
framework called OTTO (Organizing Teams Through Overhearing). OTTO
deals with cooperation between non-selfish agents in communication-limited and
dynamic environments. More precisely, the focus is on cooperation with local
communication. The property of locality is classically induced by the discrepancy
between the range of communication and the distance between the agents,
but we generalize this to encompass situations in which (1) not every agent can
communicate with every other team member at will, and (2) the communication
possibilities between agents remain the same during (short) time intervals.
In such conditions, a solution to achieve cooperation consists in creating local
groups of agents that are able to interact during a certain period. Thus, OTTO
allows the dynamic evolution of groups (subteams) within a team of agents
accomplishing a collaborative task. Each group accomplishes the task with no (or
little) coordination with other groups. OTTO is task-independent; its objective is
to allow the formation of groups despite very adverse communication conditions.
The responsibility for the adequacy between the agents’ skills and the needs of 
the group is left to the agents and particularly the leaders (the agents at the 
origin of groups). Considering the dynamic nature of the environment, we avoid
over-long negotiations: agents decide to petition the leader for acceptance in the
group and the leader accepts or rejects them, both basing their decisions on
task-dependent information.

Fig. 1. Organization map.

Initially, OTTO was designed (see [13, 14]) for small teams; experiments were
run with at most 16 team-mates. Given the lightweight nature of OTTO, it was
natural to try to use it within large teams of (simple) agents. But the size of
the data that each agent has to store within the first implementation of OTTO
grows with the square of the size of the team, and each agent is supposed to
know beforehand the composition of the team. These limitations become serious
when one moves from a few agents to several dozens or hundreds of simple
agents with low processing power, for example micro-aerial vehicles (MAVs).
Therefore, we have extended OTTO to LOTTO. In LOTTO, agents can discover
the existence of their team-mates by overhearing and can limit their memory
usage by “forgetting” the team-mates they are not directly interacting with
and whose information is the oldest. The point of LOTTO is that forgetting
some of the information that would be kept within OTTO introduces little
additional incoherence in the team and allows the framework to scale up to
large teams while maintaining low resource usage.

This paper is organized as follows: Section 2 describes the components and
concepts supporting OTTO and therefore LOTTO as well; it can be safely skipped
by the reader already familiar with this material. Section 3 presents LOTTO
itself. The results of a series of experiments conducted on LOTTO are described
and discussed in Section 4, and finally some related work is discussed in Section 5.



2 The Group Formation Framework OTTO 

Let $\Gamma = \{A_1, A_2, \ldots, A_n\}$ represent the team, composed of agents $A_i$. The set
of groups forms a partition¹ of the team, i.e. at any moment an agent belongs
to a group, even if the agent is alone. At any moment, it is theoretically possible
to draw a map $\pi^* \in \widehat{\Gamma}$ of the team’s organization as in the example of Figure 1,
which shows the state of a four-member team $\Gamma$. The gray boxes correspond to
the different groups while the o’s indicate the leaders of the groups.

But such global information is not readily accessible. Each agent has its own
beliefs about the state of the team, i.e. its own map, which can be incomplete
or even erroneous. So, for each task, an agent $A_i$ has only a semi-partition² of
the team in groups. If we denote by $\pi_{A_i}$ the map of the team’s organization
held by $A_i$, we have $\pi_{A_i} \in \widehat{\Gamma}_{A_i}$.

¹ We denote by $\widehat{E}$ the set of the partitions of $E$. By definition, we have $X \in \widehat{E}$ if and only
if: (i) $\emptyset \notin X$; (ii) $E = \bigcup_{x \in X} x$; (iii) $\forall (x, y) \in X^2 : x \neq y \Rightarrow x \cap y = \emptyset$.

Rather than striving to construct a global $\pi^*$ from the various $\pi_{A_i}$, the
objective of the framework is to obtain a set of $\pi_{A_i}$ that is as coherent as possible.
In Figure 2, we can see two types of incoherence. First, no-one except $A_1$ itself
knows that it has created its own group; this has no serious consequence as no-one
expects cooperation from $A_1$. Second, we can see that $A_2$ seems to have left
its previous group and to have created its own, but neither $A_4$ nor $A_3$ is informed
about that. This incoherence is serious as $A_3$ and $A_4$ might expect some cooperation
from $A_2$. In short: inter-group incoherence is considered harmless as opposed to
intra-group incoherence.



2.1 Propositional Attitudes 

In order to allow the creation and evolution of the groups, we have identified
three classes of Propositional Attitudes (PAs) [8]: positive (creating a group or
expressing one's desire to join one), negative (dissolving a group or quitting one)
and informative (asking the other agents for information about the groups or
giving such information). By combining these classes with the status of the sender
(being a leader or not) we have determined six types of PAs for the framework:

AFF (positive and informative PA, leader) is used by an agent to assert the
creation of a group, thereby publicly becoming its leader and sole member, i.e.
that agent becomes liable to receive requests from other agents to join this
group. Subsequently, the leader also uses AFFs to express its beliefs about
the evolving composition of its group. Note: a group is entirely identified by
its leader and its creation date.

BEL (informative PA, non-leader) is used by an agent to express its beliefs about 
a group while not being its leader. 

JOIN (positive PA, non-leader) is used by an agent to express its desire to join 
a pre-existing group. It implies that the agent is alone i.e. sole member and 
implicit leader of its own non-public group. 

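² For a given set $E$ and an element $a \in E$, we define $\widehat{E}_a$ as the set of the semi-partitions of
$E$ covering $a$. By definition, we have $X \in \widehat{E}_a$ if and only if: (i) $\emptyset \notin X$; (ii) $\bigcup_{x \in X} x \subseteq E$;
(iii) $\forall (x, y) \in X^2 : x \neq y \Rightarrow x \cap y = \emptyset$; (iv) $\exists x \in X, a \in x$.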

CHK (informative PA) is used by an agent to request information from the other
agents about a particular group. A belief is expressed and the other agents
are expected to confirm or refute it with BELs/AFFs.

BRK (negative PA, leader) is used by a leader to signify the dissolution of its 
group. All the members (including the leader) are immediately considered 
to be alone. 

QUIT (negative PA, non-leader) is used by a non-leader member of a group to 
quit it. This agent is immediately considered to be alone. 

These PAs are instantiated in the form: 

$PA(A_e, T_e, (A_L, T_L), \{(A_j, d_j), \ldots\})$

where $A_e$ is the sender and $T_e$ is the date of emission. $(A_L, T_L)$ identifies the
group: $A_L$ is the leader and $T_L$ is the date of creation. If the PA deals with
lone agents, it is instantiated with $\emptyset$ instead of $(A_L, T_L)$. $\{(A_j, d_j), \ldots\}$ is the
set of the members with the latest dates at which they were acknowledged as
being members (these can only correspond to the emission date of a previous
PA, see Section 2.3). These instantiated PAs are the only kind of message that
the agents use to communicate about groups within the team.
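
As an illustration only, an instantiated PA could be held in a small record such as the one below. The class and field names are hypothetical, and the check that no member date exceeds the emission date is our reading of the remark that these dates echo the emission date of a previous PA, not a rule stated by the framework.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class PAType(Enum):
    AFF = "AFF"
    BEL = "BEL"
    JOIN = "JOIN"
    CHK = "CHK"
    BRK = "BRK"
    QUIT = "QUIT"


@dataclass
class PA:
    kind: PAType
    sender: str                         # A_e
    emitted: int                        # T_e
    group: Optional[Tuple[str, int]]    # (A_L, T_L); None stands for the empty set
    members: Dict[str, int]             # A_j -> d_j

    def __post_init__(self) -> None:
        # Assumption: a d_j repeats the date of an earlier PA, so d_j <= T_e.
        for agent, date in self.members.items():
            if date > self.emitted:
                raise ValueError(f"date {date} for {agent} is later than emission")

    def __str__(self) -> str:
        group = "∅" if self.group is None else f"({self.group[0]}, {self.group[1]})"
        members = ", ".join(f"({a}, {d})" for a, d in self.members.items())
        return f"{self.kind.value}({self.sender}, {self.emitted}, {group}, {{{members}}})"
```

For instance, `str(PA(PAType.BEL, "A2", 14, ("A1", 6), {"A1": 12, "A2": 12, "A3": 12}))` gives `BEL(A2, 14, (A1, 6), {(A1, 12), (A2, 12), (A3, 12)})`.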

2.2 Overhearing 

According to Kraus [12], “In non-structured and unpredictable environments,
heuristics for cooperation and coordination among automated agents, based on
successful human cooperation and interaction techniques, may be useful.” Therefore
let us consider overhearing. This phenomenon is studied in ergonomics [10]
and corresponds to the fact that human subjects working in a common environment
have a tendency to intercept messages that are not clearly directed to
them and to use this information to facilitate group work.

The framework that we propose uses broadcast communication, so each message
(an instantiated PA) is potentially received by every other team member.
When receiving (or intercepting) a message, an agent $A_i$ uses it to update its
knowledge $\pi_{A_i}$ about the organization of the team even if $A_i$ is not directly
concerned by this message. This mechanism is described in Section 2.3. This is
how we mirror overhearing in teams of artificial agents. The agent $A_i$ monitors
the activity and organization of the whole team though it may have no direct
use for this information in its current activity. This information can have three
different uses: (1) it may actually concern the current activity of $A_i$; (2) it
may be used in the future by $A_i$ if it wishes to move from its current group to
another; (3) it may be transferred to other agents if these agents seem to have
inexact beliefs.

In this paper, we present overhearing in the light of the group formation 
framework, but this framework is designed to be part of a complete cooperation 
architecture in which the agents overhear task information in addition to co- 
operation information. For example, information obtained through overhearing 
may motivate an agent to quit its group to join another in which it may be more 
useful to the team. 

2.3 Group Tracking

In order to build a representation $\pi_{A_i}$ of the organization of the team at time
$t$, an agent $A_i$ has potentially at its disposal all the messages (instantiated
PAs) it has received up to time $t$. Rather than storing all these messages we
have implemented a mechanism that updates a map with each message received.
This is the reason why the dates $d_j$ are present in the instantiated PAs:
$PA(A_e, T_e, (A_L, T_L), \{(A_j, d_j), \ldots\})$. Updating the map $\pi_{A_i}$ with a message is
based on the semantics of the PA and date comparisons in order to determine
whether the information in the message about a given agent $A_j$ is relevant or
not. For each agent $A_j$, the latest date $D_j$ at which $A_j$ was known to be part of
a group (or to be alone) is stored, updated with the $d_j$ present in the messages
(when greater than the current value of $D_j$) and used to instantiate the messages
subsequently emitted by $A_i$.

In addition, for each agent $A_j$ and for each known group $\gamma$, the latest date
$D_j^{\gamma}$ at which $A_j$ was not a member of $\gamma$ is stored. $D_j^{\gamma}$ is useful for example if a
message indicates that $A_j$ is a member of $\gamma$ at $d_j$ and if $D_j^{\gamma} > d_j > D_j$. In this
situation, $A_j$ is not considered a member of $\gamma$ even though $d_j > D_j$.

The $d_j$ present in the instantiated PAs mean that there exists objective
evidence that at the date $d_j$, agent $A_j$ was a member of the group concerned
by the message. The $d_j$ correspond to the emission date of previous messages
emitted by $A_j$ or its leader. When the leader emits a message stating that $A_j$ is in
its group, it uses the emission date of $A_j$’s latest message saying so, and conversely
for $A_j$, which uses the emission date of its leader’s latest AFF. For a given group
and a member $A_j$, the first $d_j$ is present in the AFF that acknowledges $A_j$ as
a member and is equal to the emission date of $A_j$’s JOIN. See Figure 3 for an
example.



$JOIN(A_2, \ldots, (A_1, 6), \{(A_1, 7), (A_3, 8)\})$

$AFF(A_1, 12, (A_1, 6), \{(A_1, \ldots), (A_2, \ldots), (A_3, 8)\})$

$BEL(A_3, 13, (A_1, 6), \{(A_1, 12), (A_2, \ldots), (A_3, 12)\})$

$BEL(A_2, 14, (A_1, 6), \{(A_1, 12), (A_2, 12), (A_3, 12)\})$

Fig. 3. Example of evolution of the dates $d_j$. The emission date of $A_2$’s JOIN is used
by $A_1$ as a “certainty date” of $A_2$’s membership.



An incremental mechanism based on date comparisons makes it possible to forget
the messages previously received and still obtain the correct membership for any
agent $A_j$, given the messages received in the past. The crucial hypotheses are that the
groups form a semi-partition of the team (and particularly that an agent cannot
be in two groups at the same time) and that the agents are rational: they cannot
simultaneously emit two incoherent messages.

Under these hypotheses, we do not need to store the messages received up to
time $t$ to deduce to which group $A_j$ belongs. It is sufficient to store and incrementally
update information of the form $(\gamma_j, D_j, \{D_j^{\gamma} \ldots\})$, even if the messages
are received in a “wrong” order ($\gamma_j$ is the current group of $A_j$ and the dates
$D_j$, $\{D_j^{\gamma} \ldots\}$ were explained earlier). For more details and a proof, please refer
to [14].
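
The date comparisons described in this section can be illustrated with a deliberately simplified sketch. It is not the mechanism proved in [14]: the `Track` record and the two update functions are hypothetical, and they only cover the two elementary facts "was a member at date d" and "was not a member at date d".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Group = Tuple[str, int]   # a group is identified by (leader, creation date)


@dataclass
class Track:
    """What one agent stores about a team-mate A_j."""
    group: Optional[Group] = None                               # gamma_j, None if alone
    latest: int = -1                                            # D_j
    not_member: Dict[Group, int] = field(default_factory=dict)  # D_j^gamma


def note_member(track: Track, group: Group, d: int) -> bool:
    """Process evidence that A_j was in `group` at date d; True if accepted."""
    if d <= track.latest:
        return False                      # older than what is already known
    if track.not_member.get(group, -1) > d:
        return False                      # A_j is known to be out of the group after d
    old = track.group
    if old is not None and old != group:
        # Semi-partition hypothesis: being in `group` at d means not being in `old`.
        track.not_member[old] = max(d, track.not_member.get(old, -1))
    track.group, track.latest = group, d
    return True


def note_not_member(track: Track, group: Group, d: int) -> None:
    """Process evidence that A_j was not in `group` at date d."""
    track.not_member[group] = max(d, track.not_member.get(group, -1))
    if track.group == group and d > track.latest:
        track.group, track.latest = None, d   # A_j is now believed to be alone
```

Because each update only compares dates with the stored values, the messages themselves can be discarded and a late message carrying an old date is simply ignored.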

3 LOTTO: OTTO for Large Teams 

The key idea behind LOTTO is that an agent $A_i$ can “forget” information about
another agent $A_j$ if it has been a long time since it last received information
about $A_j$. That is: the less often an agent interacts with a team-mate, the less it
needs to keep track of that team-mate's whereabouts. So, as we consider that our
agents have limited resources available to track the organization of the team, they
must “make room” in their memory in order to accommodate the most recent and
relevant information. The basic information units that we consider are the structures
$S_j = (\gamma_j, D_j, \{D_j^{\gamma} \ldots\})$. In order to compare the relative interest of two
structures $S_{j_1}$, $S_{j_2}$, we define four classes by order of importance:

1. $\{S_i\}$: the information concerning the agent $A_i$ itself.

2. $\{S_j \mid \gamma_j = \gamma_i\}$: the agents belonging to the same group as $A_i$. This is crucial
in order to preserve intra-group coherence. One can note that, therefore, the
size of the memory sets a bound on the maximal size of a group.

3. $\{S_j \mid \mathrm{lead}(\gamma_j) = A_j\}$: the agents leading their own group. These are the agents
that $A_i$ petitions if it must join another group.

4. The rest of the team.

Within a given class, $S_{j_1}$ is preferred to $S_{j_2}$ if $D_{j_1} > D_{j_2}$. Therefore one
can sort the structures present in memory and keep only a limited number of
structures.
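
The ordering above translates directly into a sort key. The sketch below is a minimal illustration in which the `Structure` record, its string group identifiers and the `prune` function are assumptions made for the example; only the four classes and the tie-break on $D_j$ come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Structure:
    agent: str               # A_j
    group: Optional[str]     # identifier of gamma_j (hypothetical encoding)
    leader: Optional[str]    # lead(gamma_j)
    latest: int              # D_j


def importance(s: Structure, me: str, my_group: Optional[str]) -> int:
    """Class of a structure: 0 is the most important, 3 the least."""
    if s.agent == me:
        return 0                                   # the agent itself
    if my_group is not None and s.group == my_group:
        return 1                                   # same group as the agent
    if s.leader == s.agent:
        return 2                                   # leaders of their own group
    return 3                                       # the rest of the team


def prune(structures: List[Structure], me: str,
          my_group: Optional[str], capacity: int) -> List[Structure]:
    """Keep the `capacity` most interesting structures, most recent first in a class."""
    ranked = sorted(structures,
                    key=lambda s: (importance(s, me, my_group), -s.latest))
    return ranked[:capacity]
```

With a capacity smaller than the size of the agent's own group, class 2 structures are dropped, which is the bound on group size noted in item 2.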

We have seen in the preceding section that OTTO allows an agent to track a
given team-mate $A_j$ by updating a structure $S_j = (\gamma_j, D_j, \{D_j^{\gamma} \ldots\})$ where each
date $D_j^{\gamma}$ corresponds to the latest date at which $A_j$ was known not to be part of
the group $\gamma$ led by some agent $A_k$ (if such a group exists). In OTTO the dates
$D_j^{\gamma}$ are kept for all potential leaders (which is the reason behind the memory
usage growing with the square of the size of the team). As LOTTO keeps track of
a limited number of agents, the number of potential leaders is bounded. Therefore,
by limiting the number of the structures $S_j$, one can bound the memory used by
an agent to track the organization of the team.

4 Experiments 

We have conducted a series of experiments on LOTTO in order to validate our
hypothesis that memory-bounded agents can use this extension of OTTO in large
teams with only a small effect on overall team coherence.

4.1 Experimental Setup

As in [13, 14], the decision processes of several agents are simulated with stochastic
Finite State Machines (FSMs) driven by Poisson laws that enable them to
periodically create, join, quit or check groups. In order to obtain “realistic”
communication conditions, agents are simulated as moving randomly in a
2-dimensional space and having a limited range of communication. The agents
have a constant speed of 5 m·s⁻¹, change heading following a Poisson law with a
time constant of 10s and they move within a square 1 km × 1 km area. Figure 4
illustrates this.
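
A movement model of this kind can be sketched as follows. The function name, the time step, the uniform choice of a new heading and the reflection on the borders of the area are all assumptions made for this illustration; the text only fixes the speed, the time constant and the size of the area.

```python
import math
import random
from typing import List, Optional, Tuple


def random_walk(duration: float = 600.0, dt: float = 1.0, speed: float = 5.0,
                side: float = 1000.0, tau: float = 10.0,
                rng: Optional[random.Random] = None) -> List[Tuple[float, float]]:
    """Trajectory of one agent: constant speed, heading changes as a Poisson process."""
    if rng is None:
        rng = random.Random()
    x, y = rng.uniform(0.0, side), rng.uniform(0.0, side)
    heading = rng.uniform(0.0, 2.0 * math.pi)
    # Inter-arrival times of a Poisson process are exponential with mean tau.
    next_turn = rng.expovariate(1.0 / tau)
    path = [(x, y)]
    for step in range(1, int(round(duration / dt)) + 1):
        now = step * dt
        while next_turn <= now:
            heading = rng.uniform(0.0, 2.0 * math.pi)
            next_turn += rng.expovariate(1.0 / tau)
        x += speed * dt * math.cos(heading)
        y += speed * dt * math.sin(heading)
        # Assumed boundary rule: bounce off the borders of the square.
        if x < 0.0 or x > side:
            x = -x if x < 0.0 else 2.0 * side - x
            heading = math.pi - heading
        if y < 0.0 or y > side:
            y = -y if y < 0.0 else 2.0 * side - y
            heading = -heading
        path.append((x, y))
    return path
```

Calling `random_walk(rng=random.Random(0))` produces a reproducible 10-minute trajectory sampled once per second.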




Fig. 4. Trajectories of four agents during a 10-minute run.



The main parameters of the simulations are: 

— range: the communication range of the agents - within {50m, 200m, 350m, 
500m}. 

— N: the number of agents present in the simulation, within {30, 60, 90, 120}. 

— Tchk: the average time interval between two CHKs emitted by an agent - 
within {12.5s, 25s, 37.5s, 50s}. 

— memory: the ratio between the number of structures $S_j$ stored in memory
and N - within {0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875, 1.0}.

We average two values to evaluate the behavior of lotto during 10-minute 
simulation runs: 

— cohesion is the ratio between the number of agents that are members of a 
group (while not being its leaders) and N. To be considered member of a 
group 7, the agent and the leader of 7 must believe so. 

Table 1. Correspondence range - density.

range    density
100m     3%
300m     22%
500m     49%

— incoherence is calculated at a given moment by counting the number of cases
where a leader wrongly believes that an agent is a member of its group or
where an agent wrongly believes itself to be a member of a group. This number is
then divided by N to obtain an average number of actual incoherences per
agent.
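
The two measures can be computed from the agents' beliefs as in the sketch below. The representation is an assumption: `believes_in` maps each agent to the leader of the group it believes it belongs to (or `None` if alone), `members_of` maps each leader to the set of agents it believes are in its group, and a belief is counted as wrong when the other party does not share it.

```python
from typing import Dict, Optional, Set, Tuple


def cohesion_and_incoherence(n: int,
                             believes_in: Dict[str, Optional[str]],
                             members_of: Dict[str, Set[str]]) -> Tuple[float, float]:
    """Return (cohesion, incoherence) for a team of n agents."""
    agreed = 0   # non-leader members acknowledged by both sides
    wrong = 0    # one-sided membership beliefs
    for agent, leader in believes_in.items():
        if leader is None or leader == agent:
            continue
        if agent in members_of.get(leader, set()):
            agreed += 1
        else:
            wrong += 1           # the agent believes it is a member, the leader does not
    for leader, members in members_of.items():
        for agent in members:
            if agent != leader and believes_in.get(agent) != leader:
                wrong += 1       # the leader believes so, the agent does not
    return agreed / n, wrong / n
```

In a four-agent team where leader A1 lists A2 and A3, A2 agrees, A3 believes it is alone and A4 wrongly believes it belongs to A1's group, this gives a cohesion of 0.25 and an incoherence of 0.5.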

For the sake of comparison with previous experiments [13, 14], Table 1 presents
the correspondence between the range of communication and the average reliability
of the communication (noted a in our previous work). This corresponds
to the average density of the communication graph, i.e. the ratio between the
number of pairs of agents that can communicate and the number of edges of
the complete graph of size N, i.e. $N(N-1)/2$. In our previous experiments, the
default density was 50%; we deal here with much lower values.
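
The density defined here is straightforward to compute for a snapshot of agent positions. The function below is a minimal sketch; its name and the assumption that two agents can communicate exactly when their distance does not exceed the range are ours.

```python
import itertools
import math
from typing import Sequence, Tuple


def density(positions: Sequence[Tuple[float, float]], comm_range: float) -> float:
    """Ratio of communicating pairs to the N(N-1)/2 edges of the complete graph."""
    n = len(positions)
    if n < 2:
        raise ValueError("at least two agents are needed")
    linked = sum(1 for p, q in itertools.combinations(positions, 2)
                 if math.dist(p, q) <= comm_range)
    return linked / (n * (n - 1) / 2)
```

Averaging this value over many snapshots of randomly moving agents gives the kind of range-to-density correspondence reported in Table 1.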

4.2 Results and Analysis 

Here are some of the results of the numerous simulations we have run. These
results are presented as 3-dimensional plots representing the effects of the
variation of N, range and Tchk, for several values of memory, so as to show
whether using LOTTO degrades team performance compared to OTTO (which is
equivalent to LOTTO with memory = 1).

Variations of range 

Figures 5 and 6 show the evolution of cohesion and incoherence for different
values of range and memory. On the one hand, we can note that globally, range
has a small impact on cohesion except when CHKs are emitted very often and
with a very small memory (Figure 5), which is due to a kind of saturation of
the agents: information flows very quickly and they only have a small memory,
so the more information they get (the greater the range) the more “confused”
they are. On the other hand, we can say that range has an important impact
on incoherence, as more communication possibilities reduce the occurrence of
“misunderstandings” within the team.

Concerning the effect of memory we observe that reducing the size of the 
memory has an impact on the capability of the team to form (large) groups, as 
it reduces cohesion. But we must note also that it has little to no impact on 
incoherence. 

Variations of N 

Figure 7 illustrates the evolution of cohesion and incoherence for different values of
N and memory with range = 100m and Tchk = 50s. We can note that memory
has no effect here, on either cohesion or incoherence. The only observable
variation is the increase of cohesion with N. As we have range = 100m,
bigger values of N bring more possibilities of communication for the agents, and
therefore a better cohesion.

Fig. 5. Cohesion and incoherence variations for N = 120 and Tchk = 5s.

Figure 8 illustrates the evolution of cohesion and incoherence for different
values of N and memory with range = 500m and Tchk = 50s. On the one hand,
the increased range gives very good communication possibilities to the agents,
therefore variations of N do not have the effect on cohesion seen in Figure 7. On
the other hand, with such good communication possibilities, the agents can make
good use of their memory to increase cohesion within the team. However, we can
observe that small values of memory lead to an increase of incoherence. This
phenomenon can be explained by the fact that each agent can interact with a
large number of its teammates, and when this number is larger than its
memory capacity, important information is lost, which leads to incoherence.

Variations of Tchk 

Figures 9 and 10 illustrate the evolution of cohesion and incoherence for different
values of Tchk and memory. Here we can observe that in general Tchk has
no notable impact on either cohesion or incoherence. This corresponds
to the results already obtained in [13, 14] and is in accordance with one of the
central ideas behind OTTO (and therefore LOTTO), i.e. that by using overhearing
it is possible to obtain a good coherence in the team without having the agents
constantly checking each other for membership.

Fig. 6. Cohesion and incoherence variations for N = 120 and Tchk = 30s.

As in the preceding section dealing with the variation of N, the effects of the
variation of memory are only observable for very good communication possibilities,
in which case a large memory allows a good cohesion while a small memory
brings some incoherence.

4.3 Discussion 

We can draw two global conclusions on the behavior of LOTTO from these
experimental results: (1) limiting the size of the memory has no impact on the
global incoherence of the team except in extreme cases; (2) this limitation of
memory seems to affect the cohesion of the team by preventing the agents from
forming large groups. This second point seems to be a serious drawback for LOTTO,
but we must notice that the group-formation behavior of the agents in these
simulations is very rudimentary (based on very simple FSMs). Therefore, we
believe that it is possible to design group-formation behaviors for the agents
that would alleviate this limitation of LOTTO.

Fig. 7. Cohesion and incoherence variations for range = 100m and Tchk = 50s.



5 Related Work 

The problem of team or group formation has been widely dealt with from several
points of view. Among others, game theory has been used by Shehory and Kraus
[15] to evaluate and divide the payoff of cooperation. Cohen et al. [5] stress the
importance of the joint mental state or commitment of the participants to
differentiate collaborative or joint action from mere coordination. Tidhar et al. [19]
propose that a team leader makes a selection among pre-compiled teams, using
the commitment of the potential team members to guide its choice. Dignum
et al. [6] use formal structured dialogues as opposed to fixed protocols like
contract nets [16]. This allows them to prove that a certain dialogue has a
specific outcome. Let us focus on recent literature relative to group formation and
overhearing. These studies seldom take into account the communication possibilities
between agents, whether static or fully dynamic as in OTTO/LOTTO. Furthermore,
the majority of research in the field considers only 1-to-1 communication between
agents, while we focus on broadcast communication.

5.1 Group Formation 

Group and coalition formation has been recently studied in dynamic contexts,
i.e. contexts where agents, tasks, and the environment may change over time.

Fig. 8. Cohesion and incoherence variations for range = 500m and Tchk = 50s.



[11] deals with coalition formation in dynamic environments. The DCF (Dy- 
namic Coalition Formation) problem is such that: agents may enter or leave the 
coalition formation process at any moment; the set of tasks to be accomplished 
and the resources may change dynamically; the information, network and envi- 
ronment of each agent may change dynamically. 

DCF-S, a simulation dedicated to DCF, is proposed. In this framework, each 
agent can adapt its own decision-making to changes in the environment through a 
learning process. Special agents, the World Utility Agents, maintain information 
about their registered agents. Each coalition is represented by a coalition leading 
agent (CLA) and each agent is a CLA for the coalition it has formed for one of 
its goals. 

Once the CLA has determined the set of goals to be accomplished, it simu- 
lates the formation of coalitions: possible candidates are determined, then coali- 
tions of limited sizes are simulated by randomly adding or removing candidates 
and assessing both the contribution and risk of each possible member. The sim- 
ulation goes on when changes in the environment occur. Then, the negotiation 
phase determines binding agreements between the agents within the coalitions. 
Fig. 9. Cohesion and incoherence variations for range = 100m and N = 120.



The problem of coalition formation in time- and resource-constrained and
partially known environments is dealt with in [17,18]. Sub-optimal coalitions 
are searched for, through a two-step process: (1) initial coalition formation is 
performed by the initiating agent on the basis of its knowledge about its neigh- 
bours’ (numerical) utilities for the task to achieve, which represent their skills 
for the task and their past and current relationships with the initiating agent. 
Negotiation is started only with top-ranked candidates; (2) coalition refinement 
is based on 1-to-1 negotiations to find out which agent is actually willing to help
and to make constraints and commitments clear. The application is multisensor 
target tracking, where each sensor is controlled by an agent. Several algorithms 
have been tested for resource allocation, to search for a good compromise be- 
tween speed of coalition formation with incomplete information, robustness and 
flexibility of coalition. 

Fig. 10. Cohesion and incoherence variations for range = 500m and N = 120.

[4] build coalitions under the assumption that the agents’ utilities are neither
comparable nor need to be aggregated, and use dynamic reconfiguration to update
coalitions when changes occur. The aim of the protocol is to find a Pareto-optimal
solution, i.e. a solution where an agent’s utility cannot be increased
without decreasing another agent’s. The algorithm is based on the fact that the
agents involved in the negotiation pass possible sets of coalitions round, each
agent sorting the sets according to its preferences, if they are at least equivalent
to a reference situation (the modified previous optimal situation in case
of dynamic reconfiguration). Implementation is performed on a class allocation
problem with preferences set on agents (teachers and students). Results (number
and sizes of messages, number of coalition sets evaluated) are given for a small
number of agents (4) and tasks (4 to 8) with different heuristics for improving
the computational complexity of the protocol.
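
The notion of Pareto-optimality used above can be made concrete with a small
sketch. It is not the protocol of [4]; it only illustrates the dominance test,
assuming each candidate solution is summarized by one utility value per agent
(the names dominates and pareto_optimal are ours). Utilities are only ever
compared agent by agent, never across agents.

```python
from typing import List, Sequence, Tuple


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u is at least as good as v for every agent and strictly better for one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_optimal(candidates: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the candidates that no other candidate dominates."""
    return [u for u in candidates if not any(dominates(v, u) for v in candidates)]


# Each tuple holds (utility of agent 1, utility of agent 2) for one candidate solution.
solutions = [(3, 1), (1, 3), (2, 2), (1, 1), (2, 1)]
front = pareto_optimal(solutions)
```

In this toy data set the last two candidates are dominated, so only the first
three remain on the Pareto front.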

5.2 Overhearing 

Multi-party communication (as opposed to dyadic communication) and specifi- 
cally overhearing is quite new in the agent literature [7]. Most often, overhearing 
agents are special agents whose role is to track dialogues between a group of 
agents and provide information either to some agents inside the group who have 
subscribed to the service, or to external agents. An overhearing agent has to 
classify and assess the overheard communication according to criteria that are 
useful to the end-users. 

The possibility of broadcasting messages to a wide audience including
overhearing agents unknown to the sender is studied by Busetta et al. [3, 1, 2]. In [1],
communication is based on channeled multicast, with the following characteristics:
many channels can co-exist; each channel is dedicated to one theme; each
message has a specific destination (one agent or one group of agents); each agent
can tune into any channel. Therefore messages are not necessarily routed to their
best destination (they can be heard by everybody, hence overhearing) and delivery
is not guaranteed. Overhearing enables unplanned cooperation. The communication
architecture implements overhearers, i.e. agents whose role is to listen to
several channels and forward relevant messages to subscriber agents (even if they
are not intended to receive the messages). Communication between overhearers
and their subscribers is point-to-point. The implementation (LoudVoice) focuses
more on real-time than on reliability of message delivery. Channeled multicast
improves such protocols as the English Auction Interaction Protocol (reduction
of the number of messages, increase of the social awareness of the auction’s
progress through updating of a group belief, since all listeners can update their
data about the other agents). Only four types of messages are required and the
states of the agents (auctioneer and participants) are described by simple finite
state machines. Overhearing agents monitor several parameters on the auction
and on the participants and may provide the auctioneer or bidders with private
reports and suggestions. Therefore overhearing is based on communication
observation and situation monitoring (or “spying”), and is deliberately dedicated
to information delivery to specific agents.
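
The characteristics of channeled multicast listed above can be sketched as
follows. This is an illustration only and not the LoudVoice implementation:
the class names (Message, Agent, ChannelBus, Overhearer) are hypothetical, and
the sketch assumes reliable in-process delivery, whereas the text notes that
delivery is not guaranteed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    channel: str      # the theme the channel is dedicated to
    sender: str
    destination: str  # intended addressee (one agent or one group)
    content: str


class Agent:
    def __init__(self, name):
        self.name = name
        self.heard = []

    def receive(self, msg):
        self.heard.append(msg)


class ChannelBus:
    """Every agent tuned into a channel hears every message sent on it."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def tune_in(self, channel, agent):
        self._listeners[channel].append(agent)

    def send(self, msg):
        for agent in self._listeners[msg.channel]:
            agent.receive(msg)  # heard whether or not the agent is the destination


class Overhearer(Agent):
    """Listens on channels and forwards relevant messages point-to-point."""

    def __init__(self, name, is_relevant):
        super().__init__(name)
        self.is_relevant = is_relevant
        self.subscribers = []

    def receive(self, msg):
        super().receive(msg)
        if self.is_relevant(msg):
            for subscriber in self.subscribers:
                subscriber.receive(msg)
```

A subscriber that is not tuned into the channel still obtains the messages the
overhearer judges relevant, which is the unplanned cooperation the text refers to.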

The monitoring system Overseer [9] focuses on a non-intrusive approach
to agent team monitoring by overhearing. Overhearing (or key-hole plan recognition)
is performed by specific agents which are dedicated to monitoring, by
listening to the messages (routine communication) exchanged by the team agents
with each other. This avoids team agents regularly communicating their states
to the monitoring agents. Monitored agents are assumed to be truthful in their
communication to their teammates and not to deceive the monitoring agent nor
to prevent it from overhearing. When agents are monitored as independent entities
(a separate plan recognition is maintained for each agent), the belief about
an agent’s actual state is represented as a probability distribution over variables
$\{X_t\}$, $X_t$ being true when the agent is executing plan $X$ at time $t$. Evidence
for belief updating comes with messages identifying either plan initiation or
termination. When no messages are observed at time $t$, updating is performed
thanks to a probabilistic model of plan execution. A Markovian assumption is
made both for incorporating evidence and for reasoning forward to time $t + 1$
(both only based on beliefs at time $t$). The state of the team at time $t$ is the
combination of the most likely state of the individual agents at time $t$, but this
approach gives very poor results. When agents are monitored as a team, the
relationships between the agents are used, especially the fact that a message
changes the beliefs on the state of the sender and of the listeners. The coherence
heuristic is used, i.e. evidence for one team member is used to infer the other
members’ states, and to predict communications.
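
The per-agent belief update described above can be sketched as follows. This
is not the Overseer algorithm of [9]; it is a minimal illustration under two
assumptions of ours: the plan-execution model is given as a transition table,
and a message announcing plan initiation is taken at face value (agents are
assumed truthful), so it collapses the belief onto that plan.

```python
def predict(belief, transition):
    """Markovian forward step: belief at t + 1 depends only on belief at t."""
    nxt = {plan: 0.0 for plan in transition}
    for current, prob in belief.items():
        for following, step_prob in transition[current].items():
            nxt[following] += prob * step_prob
    return nxt


def update(belief, transition, initiated=None):
    """Use an overheard initiation message if there is one, else the execution model."""
    if initiated is not None:
        return {plan: (1.0 if plan == initiated else 0.0) for plan in transition}
    return predict(belief, transition)


# Hypothetical two-plan model: transition[x][y] = P(plan y at t + 1 | plan x at t).
transition = {
    "patrol": {"patrol": 0.9, "attack": 0.1},
    "attack": {"patrol": 0.0, "attack": 1.0},
}
belief = {"patrol": 1.0, "attack": 0.0}
belief = update(belief, transition)   # no message observed at this step
belief = update(belief, transition)   # still no message
```

After two silent steps the belief has drifted away from certainty about the
initial plan; an overheard initiation message would reset it.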
6 Conclusions and Future Work

We have presented lotto, an extension to our dynamic overhearing-based group
formation framework otto. lotto distinguishes itself from otto by allowing
the agents to track only the fraction of the team’s organization that is the most
useful to them and corresponds to their memory capabilities. We have run a series
of experiments on lotto and have observed that by using the same principles
as in otto but with a bounded memory no additional incoherence appears
within the team. Nevertheless, rudimentary FSM-based mechanisms are not
sufficient to form large groups within the team.

lotto will be at the heart of a fully-fledged subsumption-inspired cooper- 
ation architecture in which the team can be viewed as a multi-bodied entity 
functioning with a distributed architecture which continuously adapts its organi- 
zation within each layer to allow its components to be ready when a particular 
layer gets in control. 
Abstract. The capabilities for agents in a team to anticipate
information-needs of teammates and proactively offer relevant information are
highly desirable. However, such behaviors have not been fully prescribed
by existing agent theories. To establish a theory about proactive information
exchanges, we first introduce the concept of “information-needs”,
then identify and formally define the intentional semantics of two proactive
communicative acts, which highly depend on the speaker’s awareness
of others’ information-needs. It is shown that communications using these 
proactive performatives can be derived as helping behaviors. Conversa- 
tion policies involving these proactive performatives are also discussed. 
The work in this paper may serve as a guide for the specification and 
design of agent architectures, algorithms, and applications that support 
proactive communications in agent teamwork. 



1 Introduction 

Passive communications (i.e., ask/reply) are prevalently used in existing dis- 
tributed systems. Although the ask/reply approach is useful and necessary in 
many cases, it has several limitations, where a proactive approach may come
into play. For instance, an information consumer may not realize certain infor- 
mation it has is already out of date. If this agent needs to verify the validity of 
every piece of information before it is used (e.g., for decision-making), the
team can be easily overwhelmed by the amount of communications entailed by 
these verification messages. Proactive information delivery by the information 
source agents offers an alternative, and it shifts the burden of updating informa- 
tion from the information consumer to the information provider, who has direct 
knowledge about the changes of information. In addition, an agent itself may not 
realize it needs certain information due to its limited knowledge (e.g., distributed 
expertise). For instance, a piece of information may be obtained only through a 
chain of inferences (e.g., being fused according to certain domain-related rules). 
If the agent does not have all the knowledge needed to make such a chain of 
inferences, it will not be able to know it needs the information, let alone
request it. Proactive information delivery can allow teammates to assist
the agent under such a circumstance. 

In fact, to overcome the abovementioned limitations of “ask”, many human
teams incorporate proactive information delivery in their planning. In particular, 
psychological studies about human teamwork have shown that members of an 
effective team can often anticipate needs of other teammates and choose to assist 
them proactively based on a shared mental model [1]. We believe this type of
approach developed by human teams provides critical evidence for software
agents to be also equipped with proactive information delivery capabilities. 

Even though several formal theories on agent teamwork have been proposed, 
they do not directly address issues regarding proactive information exchange 
among agents in a team. To do this, “information-needs” should be treated as 
first-class objects, the intentional semantics of acts used in proactive communi- 
cations need to be formally defined, and agents should be committed to these 
acts as helping behaviors under appropriate contexts. 

The rest of this paper is organized as follows. In section 2 we make some 
preparations and define the semantics of elementary performatives in the Shared- 
Plan framework. In section 3 we identify two types of information-needs, and 
propose axioms for agents to anticipate these two types of information-needs 
for their teammates. In section 4 we give the semantics of two proactive perfor- 
matives based on the speaker’s awareness of information-needs, and show how 
agents, driven by information-needs of teammates, could potentially commit to 
these communicative actions to provide help. Potential conversation policies for 
ProInform and third-party subscribe are discussed in section 5. Section 6 is devoted
to comparison and section 7 concludes the paper.

2 Preparation 

We use $\alpha, \beta, \gamma, \ldots$ to refer to actions. An action is either primitive or complex.
The execution of a complex action relies on some recipe, i.e., the know-how
information regarding the action. A recipe is composed of an action expression
and a set of constraints on the action expression. Action expressions can be
built from primitive actions by using the constructs of dynamic logic: $\alpha;\beta$ for
sequential composition, $\alpha|\beta$ for nondeterministic choice, $p?$ for testing (where
$p$ is a logical formula), and $\alpha^*$ for repetition. Thus, a recipe for a complex
action $\gamma$ is actually a specification of a group of subsidiary actions at different
levels of abstraction, the doing of which under certain constraints constitutes
the performance of $\gamma$.
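
As an illustration of the constructs just listed, action expressions and
recipes can be represented as a small algebraic data type. This is only a
sketch under our own naming (Prim, Seq, Choice, Check, Star, Recipe and the
helper primitives are hypothetical and not part of the SharedPlan theory).

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Union


@dataclass(frozen=True)
class Prim:            # a primitive action
    name: str


@dataclass(frozen=True)
class Seq:             # alpha ; beta  (sequential composition)
    first: "Expr"
    second: "Expr"


@dataclass(frozen=True)
class Choice:          # alpha | beta  (nondeterministic choice)
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Check:           # p?  (test of a logical formula)
    formula: str


@dataclass(frozen=True)
class Star:            # alpha*  (repetition)
    body: "Expr"


Expr = Union[Prim, Seq, Choice, Check, Star]


@dataclass(frozen=True)
class Recipe:
    """An action expression together with constraints on it."""
    expression: Expr
    constraints: FrozenSet[str] = frozenset()


def primitives(expr: Expr) -> Set[str]:
    """Collect the names of the primitive actions an expression is built from."""
    if isinstance(expr, Prim):
        return {expr.name}
    if isinstance(expr, Seq):
        return primitives(expr.first) | primitives(expr.second)
    if isinstance(expr, Choice):
        return primitives(expr.left) | primitives(expr.right)
    if isinstance(expr, Star):
        return primitives(expr.body)
    return set()  # a test contributes no primitive action


# door_closed? ; (push | pull)*
open_door = Recipe(
    Seq(Check("door_closed"), Star(Choice(Prim("push"), Prim("pull")))),
    frozenset({"quiet"}),
)
```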

Appropriate functions are defined to return certain properties associated with
an action. In particular, $pre(\alpha)$ and $post(\alpha)$ return a conjunction of predicates
that describe the preconditions and effects of $\alpha$, respectively. By $I \in pre(\alpha)$ we
mean $I$ is a conjunct of $pre(\alpha)$.

We adopt the SharedPlan theory [2, 3] as the cornerstone of our framework.
Thus, all actions will be intended, committed and performed in some specific
context. By convention, $C_\alpha$ is used to refer to the context in which $\alpha$ is being
done, and $Constr(C_\alpha)$ refers to the constraints component of $C_\alpha$.
Bel and MB are standard modal operators for belief and mutual belief,
respectively. Three modal operators in the SharedPlan theory are used to relate
agents and actions: $Do(G, \alpha, t, \Theta)$ is used to denote that $G$ (a group of agents
or a single agent) performs action $\alpha$ at $t$ under constraints $\Theta$;
$Commit(A, \alpha, t_1, t_2, C_\alpha)$ represents the commitment of agent $A$ at $t_1$ to perform the
basic-level action $\alpha$ at $t_2$ under the context $C_\alpha$; and $Exec(A, \alpha, t, \Theta)$ is used to
represent the fact that agent $A$ has the ability to perform basic-level action $\alpha$ at
time $t$ under constraints $\Theta$. Four types of intentional attitudes were defined.
$Int.To(A, \alpha, t, t_\alpha, C_\alpha)$ means agent $A$ at $t$ intends to do $\alpha$ at $t_\alpha$ in the context
$C_\alpha$; $Int.Th(A, p, t, t', C_p)$ means agent $A$ at $t$ intends that $p$ hold at $t'$ under the
context $C_p$. Pot.Int.To and Pot.Int.Th are used for potential intentions. They
are similar to normal intentions (i.e., Int.To and Int.Th) except that before really
adopting them, the agent has to reconcile the potential conflicts that may be
introduced by the potential intentions to the existing intentions. Meta-predicate
$CBA(A, \alpha, R_\alpha, t_\alpha, \Theta)$ means agent $A$ at $t_\alpha$ can bring about action $\alpha$ by following
recipe $R_\alpha$ under constraints $\Theta$.

Grosz and Kraus proposed several axioms for deriving helpful behaviors [2, 
3]. The following one simplifies the axiom in [3] without considering the case 
of multiple-agent actions (we assume communicative acts to be examined are 
single-agent actions) and the case of action-intention conflicts. 

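Axiom 1 $\forall A, p, t, \beta, t_\beta, t' > t_\beta, C_p \cdot$

$$
Int.Th(A, p, t, t', C_p) \wedge \neg Bel(A, p, t) \wedge lead(A, \beta, p, t, t_\beta, \Theta_\beta) \Rightarrow
Pot.Int.To(A, \beta, t, t_\beta, \Theta_\beta \wedge C_p),
$$

where

$$
\begin{aligned}
lead(A, \beta, p, t, t_\beta, \Theta_\beta) \triangleq{} & Bel(A, \exists R_\beta \cdot CBA(A, \beta, R_\beta, t_\beta, \Theta_\beta), t) \wedge {} \\
& [Bel(A, (Do(A, \beta, t_\beta, \Theta_\beta) \Rightarrow p), t) \vee Bel(A, Do(A, \beta, t_\beta, \Theta_\beta) \Rightarrow {} \\
& \quad [\exists B, \alpha, R_\alpha, t_\alpha, t'' \cdot (t_\alpha > t_\beta) \wedge (t_\alpha > t'') \wedge CBA(B, \alpha, R_\alpha, t_\alpha, \Theta_\alpha) \wedge {} \\
& \quad Pot.Int.To(B, \alpha, t'', t_\alpha, \Theta_\alpha) \wedge (Do(B, \alpha, t_\alpha, \Theta_\alpha) \Rightarrow p)], t)].
\end{aligned}
$$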

Axiom 1 says that if an agent does not believe $p$ is true now, but has an
intention that $p$ be true at some future time, it will consider doing some action
$\beta$ if it believes the performance of $\beta$ could contribute to making $p$ true either
directly or indirectly through the performance of another action by another
agent.

$Hold(p, t)$ is used to represent the fact that $p$ is true at time $t$. Note that
Hold is external to any rational agents. It presupposes an omniscient perspective
from which to evaluate $p$. On the other hand, assume there exists an omniscient
agent $G$, then $Hold(p, t) = Bel(G, p, t)$. Hold will be used only within
belief contexts, say $Bel(A, Hold(p, t), t)$, which means agent $A$ believes that $p$ is
true from the omniscient agent’s perspective. Since the omniscient agent is always
trustworthy, $Bel(A, Hold(p, t), t) \Rightarrow Bel(A, p, t)$, but not vice versa.

We define some abbreviations needed later. Awareness (aware)¹, belief
contradiction (CBel) between two agents (from one agent’s point of view), and
wrong beliefs (WBel) are given as:

$$
\begin{aligned}
aware(A, p, t) &\triangleq Bel(A, p, t) \vee Bel(A, \neg p, t), \\
unaware(A, p, t) &\triangleq \neg aware(A, p, t), \\
CBel(A, B, p, t) &\triangleq (Bel(A, p, t) \wedge Bel(A, Bel(B, \neg p, t), t)) \vee {} \\
&\phantom{{}\triangleq{}} (Bel(A, \neg p, t) \wedge Bel(A, Bel(B, p, t), t)), \\
WBel(A, p, t) &\triangleq (Hold(p, t) \wedge Bel(A, \neg p, t)) \vee (Hold(\neg p, t) \wedge Bel(A, p, t)).
\end{aligned}
$$

1 We assume belief bases allow three truth values for propositions.
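
The abbreviations above can be illustrated over a three-valued belief base.
The sketch below is ours and makes simplifying assumptions: the time index is
dropped, a belief base is a dictionary from proposition names to True or False
(an absent key means the truth value is unknown), A's beliefs about B's
beliefs are a second dictionary, and Hold is read off a fully specified world.

```python
from typing import Dict

BeliefBase = Dict[str, bool]  # absent key = unknown (third truth value)


def bel(base: BeliefBase, p: str, value: bool = True) -> bool:
    """Bel(A, p) when value is True, Bel(A, not p) when value is False."""
    return base.get(p) is value


def aware(base: BeliefBase, p: str) -> bool:
    return bel(base, p, True) or bel(base, p, False)


def unaware(base: BeliefBase, p: str) -> bool:
    return not aware(base, p)


def cbel(a_base: BeliefBase, a_model_of_b: BeliefBase, p: str) -> bool:
    """Belief contradiction between A and B, from A's point of view."""
    return (bel(a_base, p, True) and bel(a_model_of_b, p, False)) or (
        bel(a_base, p, False) and bel(a_model_of_b, p, True)
    )


def wbel(world: Dict[str, bool], a_base: BeliefBase, p: str) -> bool:
    """A's belief about p disagrees with what actually holds."""
    return (world[p] and bel(a_base, p, False)) or (
        (not world[p]) and bel(a_base, p, True)
    )
```

With this reading, unaware holds exactly when the proposition is missing from
the dictionary, which is where the third truth value of footnote 1 is used.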



In the following, let TA be an agent team with finitely many members. The proposal
put forward in the SharedPlans theory is to identify potential choices of action
(ultimately represented in terms of a Pot.Int.To) as those which would reduce 
the cost or the resources required to perform actions intended by a teammate. 
For the purpose of this paper, we will only focus on barriers to actions rooted 
in lack of information regarding the preconditions of the actions. 



2.1 Reformulate Performative-as-Attempt

Following the idea of “performative-as-attempt” [4, 5], we will model the
intentional semantics of proactive performatives as attempts to establish certain
mutual beliefs between the speaker and the addressee (or addressees). In order 
to do that, we first need to reformulate the concept of Attempt within the frame- 
work of the SharedPlan theory. Then, the semantics of Inform and Request are 
given in terms of attempts, which serves partially to validate our approach of 
encoding “performative-as-attempt” in the SharedPlan framework. 

Definition 1.

$$
\begin{aligned}
Attempt(A, e, P, Q, C_n, t, t_1) \triangleq{} & [\neg Bel(A, P, t) \wedge Pot.Int.Th(A, P, t, t_1, C_n) \wedge {} \\
& Int.Th(A, Q, t, t_1, \neg Bel(A, P, t) \wedge C_n) \wedge {} \\
& Int.To(A, e, t, t, Bel(A, post(e) \Rightarrow Q, t) \wedge Pot.Int.Th(A, P, t, t_1, C_n))]?;\ e.
\end{aligned}
$$

Here, P represents some ultimate goal that may or may not be achieved by 
the attempt, while Q represents what it takes to make an honest effort. The 
agent has only a limited commitment (potential intention) to the ultimate goal 
P, while it has a full-fledged intention to achieve Q. More specifically, if the 
attempt does not achieve the goal P, the agent may retry the attempt, or try 
some other strategy or even drop the goal. However, if the attempt does not 
succeed in achieving the honest effort Q , the agent is committed to retrying 
(e.g., performing $e$ again) until either it is achieved ($A$ comes to believe $P$) or
it becomes unachievable ($t_1$ comes) or irrelevant (the escape condition $C_n$ no
longer holds) [4, 6]. Thus, the Attempt would actually be an intent to achieve $Q$
by performing e while the underlying intent was to achieve P. Of course, P and 
Q may refer to the same formula. 

For example, agent $A$ may desire that $Bel(B, I, t)$ under conditions that agent
$A$ does not believe that $B$ believes $I$. While $Bel(B, I, t)$ ($P$ in this case) may be
unachievable for $A$, $MB(\{A, B\}, Bel(B, Bel(A, I, t), t'))$ ($Q$ in this case) can be
achieved by exchanging appropriate messages with $B$. In case of communication
failure in establishing the mutual belief, $A$ will retry until either the mutual belief
is achieved or $C_n$ no longer holds or the deadline $t_1$ comes. Here $e$ may refer to
a sequence of send, the act of wrapping the message in a wire language and
physically sending it. When communication is reliable and sincerity is assumed,
one send may suffice.
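
The retry behaviour implied by an attempt can be pictured operationally. The
following is a loose sketch of ours, not a formal reading of Definition 1: the
function attempt, its callback parameters and the class FlakyChannel are
hypothetical, time is abstracted as a clock callback, and the honest effort Q
is reduced to a Boolean check.

```python
def attempt(perform, honest_effort_made, context_holds, clock, deadline):
    """Repeat `perform` until the honest effort succeeds, the deadline comes,
    or the escape condition (context) no longer holds."""
    tries = 0
    while True:
        if not context_holds():
            return "irrelevant", tries      # the context C_n no longer holds
        if clock() >= deadline:
            return "unachievable", tries    # the deadline t_1 has come
        perform()                           # e.g. one more send
        tries += 1
        if honest_effort_made():
            return "achieved", tries


class FlakyChannel:
    """Toy unreliable channel: drops the first `failures` sends."""

    def __init__(self, failures):
        self.failures = failures
        self.delivered = False

    def send(self):
        if self.failures > 0:
            self.failures -= 1
        else:
            self.delivered = True


ticks = iter(range(100))
channel = FlakyChannel(failures=2)
outcome = attempt(channel.send, lambda: channel.delivered,
                  lambda: True, lambda: next(ticks), deadline=10)
```

With a reliable channel (failures=0) a single send suffices, matching the
remark that closes the paragraph above.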

According to the speech act theory [7], every speech act has an utterance 
event associated with it. For the purpose of this paper, we simply assume all the 
utterance events are single-agent complex actions, for which each agent has full 
individual recipes. For instance, when the honest goal of a performative is to 
establish certain mutual beliefs, the recipe for the corresponding e may involve 
negotiations, persuasions, failure-handling, etc. 

The semantics of elementary performatives are given by choosing appropriate 
formulas (involving mutual beliefs) to substitute for P and Q in the definition 
of Attempt. As in [8], the semantics of Inform is defined as an attempt of the 
speaker to establish a mutual belief with the addressee about the speaker’s goal 
to let the addressee know what the speaker knows. 

Definition 2.

$$
\begin{aligned}
Inform(A, B, e, p, t, t_a) \triangleq{} & (t < t_a)?;\ Attempt(A, e, MB(\{A, B\}, p, t_a), \\
& \exists t'' \cdot (t < t'' < t_a) \wedge MB(\{A, B\}, \psi, t''), C_p, t, t_a), \text{ where} \\
\psi ={} & \exists t_b \cdot (t'' < t_b < t_a) \wedge Int.Th(A, Bel(B, Bel(A, p, t), t_b), t, t_b, C_p), \\
C_p ={} & Bel(A, p, t) \wedge Bel(A, unaware(B, p, t), t).
\end{aligned}
$$

When communication is reliable and agents trust each other, it’s easy to
establish the mutual belief about $\psi$ required in the honest goal of Inform:
agent $B$ believes $\psi$ upon receiving a message with content $\psi$ from agent $A$; and
$A$ knows this, and $B$ knows $A$ knows this, and so on.

A request with respect to action $\alpha$ is defined as an attempt of the speaker to
make both the speaker and the addressee believe that the speaker intends that
the addressee commit to performing the action $\alpha$ [5].

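Definition 3.

$$
\begin{aligned}
Request(A, B, e, \alpha, t, t_\alpha, \Theta_\alpha) \triangleq{} & (t < t_\alpha)?;\ Attempt(A, e, Do(B, \alpha, t_\alpha, \Theta_\alpha), \\
& \exists t'' \cdot (t < t'' < t_\alpha) \wedge MB(\{A, B\}, \psi, t''), C_p, t, t_\alpha), \text{ where} \\
\psi ={} & \exists t_b < t_\alpha \cdot Int.Th(A, Int.To(B, \alpha, t_b, t_\alpha, C_p), t, t_b, C_p), \\
C_p ={} & Bel(A, \exists R_\alpha \cdot CBA(B, \alpha, R_\alpha, t_\alpha, \Theta_\alpha), t) \wedge {} \\
& Int.Th(A, Do(B, \alpha, t_\alpha, \Theta_\alpha), t, t_\alpha, \Theta_\alpha).
\end{aligned}
$$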

The Request means that agent $A$ at $t$ has an attempt where (1) the ultimate
goal is for $B$ to perform $\alpha$ at $t_\alpha$, and (2) the honest goal is to establish a mutual
belief that agent $A$ has an intention that agent $B$ commit to performing $\alpha$, all
of the above being in appropriate contexts.

According to the definition, agent $A$ would be under no obligation to inform
$B$ that its request is no longer valid when $A$ discovers that $C_n$ no longer holds.
In [9] Smith and Cohen defined another version of Request in terms of a PWAG
(persistent weak achievement goal) rather than an intention. That means, upon
discovering that the goal has been achieved or become impossible, or that $C_p$ is
no longer true, agent $A$ will be left with a persistent goal to reach mutual belief
with $B$, which will free $B$ from the commitment towards $A$ regarding $\alpha$. Rather
than introducing a counterpart of PWAG into the SharedPlan framework, we
prefer to encode such team-level obligations using an axiomatization approach by
introducing an axiom stating that any agent intending others to be involved
in a team activity should also adopt an intention to release those agents from
the obligations whenever the intentional context no longer holds. The axiom is
omitted here due to space limitations.

The semantics associated with the receipt of a Request is a bit involved. In 
addition to realizing that the sender wishes him/her to commit to the action, 
the receiver can make certain deductions based upon knowledge of the semantics 
of Request. In particular, the receiver can deduce that the sender believes that 
there is a recipe the receiver could be following that would lead the receiver to 
bring about $\alpha$. Note that the Request does not indicate which recipe the receiver
should follow, only that the sender believes there exists one. This is sufficient,
though it does not guarantee that the receiver will actually perform $\alpha$. If the
receiver is not directly aware of such a recipe, it could lead the receiver to initiate
a search for an appropriate recipe. If the receiver cannot find one as the sender
expected, the receiver can discharge himself/herself from the obligation and let the
sender know the reason.



3 Information Needs 

For any predicate symbol $p$ with arity $n$, it will be written in the form $p(?x, c)$,
where $?x$ is a set of variables, $c$ is a set of constants in appropriate domains, and
the sum of the sizes of the two sets is $n$. We start with the identifying reference
expression (IRE), which is used to identify objects in the appropriate domain of
discourse [10]. An IRE is written using one of three referential operators defined
in the FIPA specification. $(iota\ ?x\ p(?x, c))$ refers to “the collection of objects,
which maps one-to-one to $?x$ and there is no other solution, such that $p$ is
true of the objects”; it is undefined if for any variable in $?x$ no object or more
than one object can satisfy $p$ (together with substitutions for other variables).
$(all\ ?x\ p(?x, c))$ refers to “the collection of sets of all objects that satisfy $p$,
each set (could be an empty set) corresponds one-to-one to a variable in $?x$”.
$(any\ ?x\ p(?x, c))$ refers to “any collection of objects, which maps one-to-one to
$?x$, such that $p$ is true of the objects”; it is undefined if for any variable in $?x$ no
object can satisfy $p$ (together with substitutions for other variables). We will
omit operator any if possible. Hence, expressions of form $(any\ ?x\ p(?x, c))$ can
be simplified as $p(?x, c)$.

Information is defined in WordNet Dictionary as a message received and 
understood that reduces the recipient’s uncertainty. We adopt the definition 
prescribed in the Open Archival Information System (OAIS) [11]: information 
is “any type of knowledge that can be exchanged, and it is always represented 
by some type of data”. Throughout this paper, we deal with two types of infor- 
mation: factual information and referential information. Factual information is 
represented as a proposition (predicate with constant arguments), and referential 
information is represented in terms of a special predicate $Refer(ire, obj)$, where
ire is an identifying reference expression, and obj is the result of the reference 
expression ire evaluated with respect to a certain theory. 
In the following we will use $I$ to represent the information to be communicated:
when $I$ refers to a proposition, the sender is informing the receivers that
the predicate is true; when $I$ refers to $Refer(ire, obj)$, the sender is informing
the receivers that those objects in $obj$ are what satisfy $ire$ evaluated with respect
to the sender’s belief base.

Now we come to the concept of information-needs. An information-need may
state that the agent needs to know the truth value of a proposition. For instance,
suppose a person sends a query $Weather(Cloudy, Today)$ to a weather station.
The weather station will realize that the person wants to know, at least literally,
whether today is cloudy². More often than not, an agent wants to know the
values of some arguments of a predicate, where the values could satisfy the
predicate. For example, a person may send a query $Weather(?x, Today)$ to a
weather station; this will trigger the weather station, if it’s benevolent, to inform
the person about the (change of) weather conditions whenever necessary.

Thus, corresponding to information, an expression for information-needs may
also be in one of two forms: described either as a proposition, or as a reference
expression. In what follows $N$ is used to refer to a (information) need-expression;
$pos(N)$ ($ref(N)$) is true if $N$ is a proposition (reference expression).
An information-need consists of a need-expression, an information consumer
(needer), an expiry time after which the need is no longer applicable, and a
context only under which the need is valid. To combine them together, we
introduce a modal operator $InfoNeed(A, N, t, C_n)$ to denote information-needs.
In case that $N$ is a proposition, it means that agent $A$ needs to know the truth
value of $N$ by $t$ under the context $C_n$³; in case that $N$ is a reference expression,
it means agent $A$ needs to know those objects satisfying the reference expression
$N$. Making the context of information-needs explicit not only facilitates the
conversion from information-needs of teammates to intentions to assist them, but
also enables the context to be included in need-driven communicative actions.
The properties of InfoNeed are omitted here.

The most challenging issue in enabling agents to proactively deliver infor- 
mation to teammates is for them to know the information-needs of teammates. 
Agents can subscribe their information-needs from other teammates. In this
paper, however, we will focus on how to anticipate potential information-needs
based on the SharedPlans theory.



3.1 Anticipate Information-Needs of Teammates 

We distinguish two types of information-needs. The first type of information- 
need enables an agent to perform certain (complex) actions, which contributes 
to an agent’s individual commitments to the whole team. We call this type 
of information-need action-performing information-need. The second type of 
information-need allows an agent to discharge itself from a chosen goal. Knowing
such information will help an agent to give up achieving an impossible or irrelevant
goal. Thus, we call this type of information-need goal-escape information-need.
We first define a generated set. For any action $\alpha$, let $Needs(\alpha)$ be a set of
need-expressions generated from $pre(\alpha)$:

1. $p \in Needs(\alpha)$, if $p \in pre(\alpha)$ is a proposition;

2. $(any\ ?x\ p(?x)) \in Needs(\alpha)$, if $p \in pre(\alpha)$ is of form $p(?x)$⁴.

2 Refer to [12] for indirect speech acts.

3 In such cases, $InfoNeed(A, p, t, C_n)$ is equivalent to $InfoNeed(A, \neg p, t, C_n)$.

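Axiom 2 (Action-performing Information-Need)

$$
\begin{aligned}
& \forall A, B \in TA, \alpha, C_\alpha, t, t' > t\ \forall N \in Needs(\alpha) \cdot \\
& Bel(A, Pot.Int.To(B, \alpha, t, t', C_\alpha), t) \Rightarrow Bel(A, InfoNeed(B, N, t', C_n), t), \text{ where} \\
& C_n = C_\alpha \wedge Pot.Int.To(B, \alpha, t, t', C_\alpha).
\end{aligned}
$$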

Axiom 2 characterizes action-performing information-needs: it states that
agent $A$ believes that agent $B$ will need information described by $N$ by $t'$ under
the context $C_n$, if $A$ believes that $B$ is potentially intending to perform action
$\alpha$ at time $t'$. The context $C_n$ of the information-need consists of $C_\alpha$ and $B$’s
potential intention to perform $\alpha$.
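
The generated set $Needs(\alpha)$ and the anticipation step of Axiom 2 can be
sketched as follows. The representation is an assumption of ours: a
precondition is a pair (predicate, arguments), arguments prefixed with "?" are
variables, and the context is reduced to the potential intention alone (the
component $C_\alpha$ is left out). The function names needs and anticipate are
hypothetical.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate, arguments); "?"-prefixed = variable


def needs(preconditions: List[Atom]) -> list:
    """Generate need-expressions from the preconditions of an action."""
    generated = []
    for pred, args in preconditions:
        variables = tuple(a for a in args if a.startswith("?"))
        if variables:
            # rule 2: a precondition with variables yields an `any` expression
            generated.append(("any", variables, (pred, args)))
        else:
            # rule 1: a ground precondition is itself a need-expression
            generated.append((pred, args))
    return generated


def anticipate(potential_intentions, pre: Dict[str, List[Atom]]) -> list:
    """For each (agent, action, t') that the observer believes to be a potential
    intention, derive the believed information-needs of that agent."""
    info_needs = []
    for agent, action, t_prime in potential_intentions:
        context = ("Pot.Int.To", agent, action, t_prime)
        for need in needs(pre[action]):
            info_needs.append((agent, need, t_prime, context))
    return info_needs


# Hypothetical domain: extinguishing a fire requires water and a fire location.
pre = {"extinguish": [("HasWater", ("truck1",)), ("FireAt", ("?loc",))]}
believed = anticipate([("B", "extinguish", 5)], pre)
```

Each resulting tuple plays the role of one $InfoNeed(B, N, t', C_n)$ that the
observing agent comes to believe.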

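Lemma 1. $\forall A, B \in TA, \phi, \alpha, C_\phi, \Theta_\alpha, t, t' > t, t'' > t'\ \forall N \in Needs(\alpha) \cdot$

$$
\begin{aligned}
& Bel(A, Int.Th(B, \phi, t, t'', C_\phi), t) \wedge Bel(A, \neg Bel(B, \phi, t), t) \wedge {} \\
& Bel(A, Lead(B, \alpha, \phi, t', t, \Theta_\alpha), t) \Rightarrow \exists C_n \cdot Bel(A, InfoNeed(B, N, t', C_n), t).
\end{aligned}
$$

Proof. Follows directly from Axioms 1 and 2.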

Similarly, let Needs(C) be the generated set of need-expressions from a set 
C of predicates. Axiom 3 specifies goal-escape information-needs. 

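Axiom 3 (Goal-escape Information-Need)

$$
\begin{aligned}
& \forall A, B \in TA, \phi, C_\phi, t, t' > t\ \forall N \in Needs(C_\phi) \cdot \\
& Bel(A, Int.Th(B, \phi, t, t', C_\phi), t) \Rightarrow Bel(A, InfoNeed(B, N, t', C_n), t), \text{ where} \\
& C_n = C_\phi \wedge Int.Th(B, \phi, t, t', C_\phi).
\end{aligned}
$$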



Axiom 3 states that if agent $A$ believes that agent $B$ has a goal towards $\phi$,
it will assume $B$ will need information described by $N$, which is generated from
the context of $B$’s intention. The context of the information-need consists of $C_\phi$
and $B$’s intention.

By reflection, a rational agent should be able to know its own information- 
needs when it intends to do some action but lacks the pre-requisite information. 
In case that A and B in Axioms 2 and 3 refer to the same agent, they state
how an agent can anticipate its own information-needs. Being aware of its own 
information-needs, an agent could subscribe its information-needs from an in- 
formation provider. 

4 Depending on domains, need-expressions of the form $(iota\ ?x\ p(?x))$ or
$(all\ ?x\ p(?x))$ can also be generated. For instance, if $\alpha$ is a joint action where
some doer should be exclusively identified, an iota expression is preferred; an all
expression is suitable if all objects substitutable for variables in $?x$ will be needed
in the performance of $\alpha$.
3.2 Assist Others’ Information Needs

When an agent knows the information-needs of its teammates by being informed 
or by anticipating, it will consider providing help. 

Let $\mathcal{B}_A$ be the belief base of agent A, and let $\mathcal{B}_A \models p$ mean that p is a logical
consequence of $\mathcal{B}_A$. For any agent A and need-expression N, function info(A, N)
returns the information with respect to N evaluated by A:



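\[
\mathit{info}(A, N) =
\begin{cases}
N & \text{if } \mathcal{B}_A \models N, \text{ and } N \text{ is a proposition,}\\
\neg N & \text{if } \mathcal{B}_A \models \neg N, \text{ and } N \text{ is a proposition,}\\
\mathit{Refer}(N, Q) & \text{if } N = (\mathit{iota}\ ?x\ p(?x)),\ Q \in \Sigma = \{\theta \cdot ?x : \mathcal{B}_A \models \theta \cdot p,\ \theta \text{ is most general substitution (mgs)}\}, \text{ and } \Sigma \text{ is singleton,}\\
\mathit{Refer}(N, Q) & \text{if } N = (\mathit{any}\ ?x\ p(?x)),\ Q \in \Sigma = \{\theta \cdot ?x : \mathcal{B}_A \models \theta \cdot p,\ \theta \text{ is mgs}\} \neq \emptyset,\\
\mathit{Refer}(N, \Sigma) & \text{if } N = (\mathit{all}\ ?x\ p(?x)),\ \Sigma = \{\theta \cdot ?x : \mathcal{B}_A \models \theta \cdot p,\ \theta \text{ is mgs}\}.
\end{cases}
\]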



info(A, N) is undefined in case that N is a proposition but neither $\mathcal{B}_A \models N$
nor $\mathcal{B}_A \models \neg N$; or in case that N = (any ?x p(?x)) but $\Sigma = \emptyset$; or in
case that N = (iota ?x p(?x)) but $\Sigma$ is not a singleton. In case that N =
(any ?x p(?x)) and $|\Sigma| > 1$, a randomly selected element of $\Sigma$ is returned. We
use defined(info(A, N)) to denote that info(A, N) is defined.
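The case analysis above can be sketched in Python. This is only an illustration under strong assumptions that are not in the text: the belief base is a set of ground facts, logical consequence is approximated by set membership (no inference), negative beliefs are stored explicitly with a leading "not", patterns contain a single variable ?x, and the referent is returned directly instead of a Refer(N, Q) term. `None` stands for "undefined".

```python
import random
from typing import FrozenSet, Optional, Set, Tuple, Union

Fact = Tuple[str, ...]   # ground atom, e.g. ("at", "tank1", "bridge")
VAR = "?x"               # the single variable allowed in a pattern (assumption)


def bindings(beliefs: Set[Fact], pattern: Fact) -> Set[str]:
    """Sigma: every value of ?x whose instantiated pattern is in the belief base."""
    sigma = set()
    for fact in beliefs:
        if len(fact) != len(pattern):
            continue
        values = {f for f, p in zip(fact, pattern) if p == VAR}
        fixed_ok = all(f == p for f, p in zip(fact, pattern) if p != VAR)
        if fixed_ok and len(values) == 1:
            sigma.add(values.pop())
    return sigma


def info(beliefs: Set[Fact], kind: str,
         pattern: Fact) -> Optional[Union[Fact, str, FrozenSet[str]]]:
    """Evaluate a need-expression; None means info(A, N) is undefined."""
    if kind == "prop":
        if pattern in beliefs:
            return pattern
        if ("not",) + pattern in beliefs:
            return ("not",) + pattern
        return None
    sigma = bindings(beliefs, pattern)
    if kind == "iota":   # defined only when exactly one referent exists
        return next(iter(sigma)) if len(sigma) == 1 else None
    if kind == "any":    # any one referent; random when there are several
        return random.choice(sorted(sigma)) if sigma else None
    if kind == "all":    # the whole set of referents
        return frozenset(sigma)
    raise ValueError(f"unknown need-expression kind: {kind}")


beliefs = {("at", "tank1", "bridge"), ("at", "tank2", "bridge"),
           ("leader", "alice"), ("not", "raining")}
print(info(beliefs, "iota", ("leader", "?x")))
print(info(beliefs, "all", ("at", "?x", "bridge")))
```

Here defined(info(A, N)) corresponds to the result not being `None`.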

The following axiom says that, when an agent comes to know another agent’s 
information needs, it will adopt an attitude of intention-that towards “the other’s 
belief about the needed information”. 



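Axiom 4 (ProAssist) $\forall A, B \in TA,\ N, C_n, t, t' > t\,\cdot$

\[
\begin{aligned}
&\mathit{Bel}(A, \mathit{InfoNeed}(B, N, t', C_n), t) \Rightarrow\\
&\quad [\,\mathit{defined}(\mathit{info}(A, N)) \Rightarrow \mathit{Int.Th}(A, \mathit{Bel}(B, \mathit{info}(A, N), t'), t, t', C_n)\ \lor\\
&\quad (\neg\mathit{defined}(\mathit{info}(A, N)) \land \mathit{pos}(N)) \Rightarrow \mathit{Int.Th}(A, \mathit{aware}(B, N, t'), t, t', C_n)\,].
\end{aligned}
\]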

We use Int.Th rather than Int.To in the axiom because Int.To requires the
agent to adopt a specific action to help the needer, while Int.Th gives the agent
flexibility in choosing whether to help (e.g., when A is too busy) and
how to help. This axiom relates information-needs to appropriate intentions-that.
Thus, Axiom 1 and Axiom 4 together enable an agent to choose appropriate
actions to satisfy its own or others’ information-needs. Note that A and
B could refer to the same agent, which means that agent A will try to help itself by
adopting appropriate intentions.



4 Proactive Communication Acts 

4.1 Proactive-Inform 

ProInform (Proactive Inform) is defined by extending the semantics of Inform
with additional requirements on the speaker’s awareness of the addressee’s
information needs. More specifically, we explicitly include the speaker’s belief about
the addressee’s need of the information as a part of the mental states being
communicated. Hence, the meaning of ProInform is an attempt by the speaker to
establish a mutual belief (with the addressee) about the speaker’s goal to let the
addressee know that (1) the speaker knows the information being communicated,
and (2) the speaker knows the addressee needs the information.

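Definition 4.
\[
\begin{aligned}
&\mathit{ProInform}(A, B, \epsilon, I, N, t, t_a, t', C_n) \triangleq [(t_a < t') \land (I = \mathit{info}(A, N))]?;\\
&\quad \mathit{Attempt}(A, \epsilon, \mathit{Bel}(B, I, t'),\ \exists t'' \cdot (t < t'' < t_a) \land \mathit{MB}(\{A, B\}, \psi, t''),\ C_p, t, t_a),
\end{aligned}
\]
where
\[
\begin{aligned}
\psi &= \exists t_b \cdot (t'' < t_b < t_a) \land \mathit{Int.Th}(A, \mathit{Bel}(B, \mathit{Bel}(A, I, t) \land \mathit{Bel}(A, \mathit{InfoNeed}(B, N, t', C_n), t), t_b), t, t_b, C_p),\\
C_p &= C_n \land \mathit{Bel}(A, I, t) \land (I = \mathit{info}(A, N)) \land \mathit{Bel}(A, \mathit{InfoNeed}(B, N, t', C_n), t) \land {}\\
&\quad [\mathit{pos}(N) \Rightarrow \mathit{Bel}(A, \mathit{unaware}(B, I, t), t) \lor \mathit{CBel}(A, B, I, t)].
\end{aligned}
\]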

Notice that $t_a < t'$, which ensures that ProInform is adopted to satisfy
others’ information needs in the future. Also, the context of the information-need
is included as an argument of ProInform. This context is part of the context
($C_p$) of the speaker’s goal (i.e., intention) to let the addressee know the
information. $C_p$ justifies the behavior of an agent who uses the communicative action.
For instance, suppose ProInform is implemented in a multi-agent system using
a component that reasons about the information-needs of teammates and a
communication plan that involves sending, receiving confirmation, and resending if
confirmation is not received. During the execution of such a plan, if the agent
realizes that the context of the addressee’s information-need is no longer true, the
agent can choose to abandon the communication plan. This use of context in the
definition of ProInform supports our choice of explicitly including the context
of information-needs in InfoNeed.
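The communication plan mentioned in the example (send, wait for confirmation, resend, and abandon once the context fails) could look roughly like the following sketch. The function name, the callback parameters, and the retry limit are hypothetical; the text does not prescribe any particular implementation.

```python
from typing import Callable


def run_proinform_plan(send: Callable[[], None],
                       confirmed: Callable[[], bool],
                       context_holds: Callable[[], bool],
                       max_attempts: int = 3) -> str:
    """Send and resend until confirmed; abandon as soon as the context of the
    addressee's information-need (C_n) is no longer believed to hold."""
    for _ in range(max_attempts):
        if not context_holds():
            return "abandoned"      # the need's context is no longer true
        send()
        if confirmed():
            return "confirmed"      # the addressee acknowledged the information
    return "gave_up"                # retry budget exhausted (an added assumption)


sent = []
outcome = run_proinform_plan(send=lambda: sent.append("I"),
                             confirmed=lambda: len(sent) >= 2,
                             context_holds=lambda: True)
print(outcome, len(sent))
```

The context check sits before every send, so a plan that is already running stops resending once the need disappears.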

The semantics of ProInform has a direct impact on the receivers. By accepting
ProInform, the addressee attempts to confirm to the informing agent that it
believed the information being communicated at the beginning of the attempt:
\[
\mathit{Accept}(B, A, \epsilon, I, N, t, t_a, t', C_n) \triangleq \mathit{Attempt}(B, \epsilon, \psi, \phi, C_n, t, t_a),
\]
where $\psi = \mathit{MB}(\{A, B\}, \mathit{Bel}(B, I, t'), t')$ and $\phi = \mathit{MB}(\{A, B\}, \mathit{Bel}(B, I, t), t_a)$.

Since the ultimate goal of ProInform is to let the addressee believe I at t',
the ultimate goal of Accept is also set to establish a mutual belief about I at
t'. Neither may be achievable, because I may change between t and t' for both
ProInform and Accept (in such a case, another ProInform may be adopted).
In case that I persists until t', the assumption of persistent beliefs will guarantee
that the addressee’s information-need is satisfied.

The addressee may reject a ProInform because (1) it knows something
different from the information received, or (2) it does not think the information
is needed. We define the first rejection as RefuseInfo, and the second as
RefuseNeed.

\[
\begin{aligned}
\mathit{RefuseInfo}(B, A, \epsilon, I, N, t, t_a, t', C_n) &\triangleq \mathit{Attempt}(B, \epsilon, \psi, \psi, C_n, t, t_a),\\
\mathit{RefuseNeed}(B, A, \epsilon, I, N, t, t_a, t', C_n) &\triangleq \mathit{Attempt}(B, \epsilon, \phi, \phi, C_n, t, t_a),
\end{aligned}
\]
where
\[
\begin{aligned}
\psi &= \mathit{MB}(\{A, B\}, \neg\mathit{Bel}(B, I, t), t_a),\\
\phi &= \mathit{MB}(\{A, B\}, \neg\mathit{Bel}(B, \mathit{InfoNeed}(B, N, t', C_n), t), t_a).
\end{aligned}
\]

Upon receiving RefuseNeed, the performer of ProInform might revise its
belief about the addressee’s information-needs.

The following properties are obvious. 

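Proposition 1. For any $t_0, t_1, t_2, t_3, t'$, where $t_0 < t_1 < t'$ and $t_0 < t_2 < t_3 < t'$,

(1) $\mathit{ProInform}(A, B, \epsilon, I, N, t_0, t_1, t', C_1) \land \mathit{Accept}(B, A, \epsilon', I, N, t_2, t_3, t', C_2) \Rightarrow \mathit{Bel}(A, \mathit{Bel}(B, I, t_2), t_3)$.

(2) $\mathit{ProInform}(A, B, \epsilon, I, N, t_0, t_1, t', C_1) \land \mathit{RefuseInfo}(B, A, \epsilon', I, N, t_2, t_3, t', C_2) \Rightarrow \mathit{Bel}(A, \neg\mathit{Bel}(B, I, t_2), t_3)$.

(3) $\mathit{ProInform}(A, B, \epsilon, I, N, t_0, t_1, t', C_1) \land \mathit{RefuseNeed}(B, A, \epsilon', I, N, t_2, t_3, t', C_2) \Rightarrow \mathit{Bel}(A, \neg\mathit{Bel}(B, \mathit{InfoNeed}(B, N, t', C_n), t_2), t_3)$.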

The following theorem can be proved using Axioms 1 and 4 and Proposition 1.

Theorem 1. $\forall A, B \in TA,\ N, C_n, t, t' > t,$

\[
\begin{aligned}
&\mathit{Bel}(A, \mathit{InfoNeed}(B, N, t', C_n), t) \land (I = \mathit{info}(A, N)) \land \mathit{Bel}(A, I, t) \land \neg\mathit{Bel}(A, \mathit{Bel}(B, I, t'), t) \Rightarrow\\
&\quad (\exists t_1, t_2, C_p \cdot \mathit{Pot.Int.To}(A, \mathit{ProInform}(A, B, \epsilon, I, N, t_1, t_2, t', C_n), t, t_1, C_p)).
\end{aligned}
\]

It states that if agent A believes I, which agent B will need by t', it will
consider proactively sending I to B by ProInform.

4.2 Proactive-Subscribe 

While an agent in a team can anticipate certain information-needs of teammates, 
it may not always be able to predict all of their information-needs, especially if 
the team interacts with a dynamic environment. Under such circumstances, an 
agent in a team needs to let teammates know about its information-needs so that 
they can provide help. There exist at least two ways to achieve this. An agent
might merely inform teammates about its information-needs, believing that they 
will consider helping if possible, but not expecting a firm commitment from them 
for providing the needed information. Alternatively, the speaker not only wants 
to inform teammates about its information-needs, but also wishes to receive a 
firm commitment from teammates that they will provide the needed information 
whenever the information is available. For instance, let us suppose that agent B 
provides weather forecast information to multiple teams in some areas of a battle 
space, and agent A is in one of these teams. If agent A needs weather forecast
information of a particular area in the battle space for a certain time period, A
needs to know whether agent B can commit to delivering such information to it.
If agent B cannot confirm the request, agent A can request another weather
information agent or consider alternative means (such as using a broker agent).

An agent’s choice between these two kinds of communicative actions obviously
depends on many factors, including the level of trust between the speaker
and the addressee, the criticality and the utility of the information-need, the
sensing capability of the addressee, and the strength of the cooperative
relationship between them. However, here we focus on capturing the semantics of
communicative actions without considering such factors, and leave the issue of
choosing communicative actions to agent designers.

The first kind of communicative action can be modeled as Inform(A, B, $\epsilon$,
InfoNeed(A, N, t'', $C_n$), t, t'). That is, A informs B at time t so that B will know
at time t' that “A will need information described by N by t'' under the context
$C_n$”. If agent B’s reply to such an Inform action is Accept, B will consider (i.e.,
have a “potential intention”) proactively delivering the needed information to A
when relevant information is available to B.

The second type of communicative action mentioned above is similar to
subscription in the agent literature. In fact, subscription between two agents is a
special case of subscription involving a “broker” agent. As the size of a team or
the complexity of its task increases, the mental model about information-needs
of teammates may vary significantly among members of the team. For instance,
as the team scales up in size or task complexity, the team is often organized into
subteams, which may be further divided into smaller subteams. Because
(top-level) team knowledge might be distributed among several subteams, agents in
one subteam might not know the team process (the plans, task assignments,
etc.) of other subteams, and hence cannot anticipate the information-needs
of agents in these subteams. To facilitate proactive information flows between
these subteams, an agent in a subteam can be the designated point of contact
with other subteams. These broker agents play a key role in informing agents
external to the subteam about information-needs of agents in the subteam.
Situations such as these motivate us to formally define the semantics of third-party
subscribe (called 3PTSubscribe). Conceptually, 3PTSubscribe, issued by a
broker agent A to information provider C, forwards the information-needs of B to
C and requests C to meet B’s needs whenever possible. When A and B are the
same agent, it reduces to “subscribe”.

It seems that the semantics of 3PTSubscribe involves a Request, since the speaker
expects the addressee to perform the information delivery action to the needer.
We might be tempted to model the communicative action as “A requests C
to Inform B regarding B’s information need”. However, defined as such, C
is required to reply based on C’s current belief (like a request to a database
server). What we want to model is that if C accepts the request, C will commit
to delivering relevant information whenever it becomes available. Neither can we
model it as “A requests C to proactively inform B regarding B’s information
need”, because that requires that agent C already know about B’s needs, which is
not the case here.

Having failed to capture the intended semantics of 3PTSubscribe by composing
existing communicative actions, we introduce it as a new performative. Thus, by
3PTSubscribe(A, B, C, $\epsilon$, N, $t_1$, $t_2$, $t_3$, $C_n$) we mean the action in which A subscribes
information-need N (as a broker) on behalf of agent B from agent C until time
$t_3$ under the context $C_n$. The ultimate intent of the action is that B could hold
the information relevant to N at time $t_3$. The intermediate effect is to establish
a mutual belief between A and C that (1) B has an information-need N by $t_3$
under the context $C_n$, and (2) whenever C acquires new information about N,
C intends to proactively inform B of it as long as B still needs it.
We formally define the semantics of 3PTSubscribe below.

Definition 5.
\[
\begin{aligned}
&\mathit{3PTSubscribe}(A, B, C, \epsilon, N, t_1, t_2, t_3, C_n) \triangleq (t_1 < t_2 < t_3)?;\\
&\quad \mathit{Attempt}(A, \epsilon, \mathit{Bel}(B, \mathit{info}(B, N), t_3),\ \exists t'' \cdot (t_1 < t'' < t_2) \land \mathit{MB}(\{A, C\}, \rho, t''),\ C_p, t_1, t_2),
\end{aligned}
\]
where
\[
\begin{aligned}
\rho &= \exists t_b \cdot (t'' < t_b < t_2) \land \mathit{Int.Th}(A, \psi \land \phi, t_1, t_b, C_n),\\
\psi &= \mathit{Bel}(C, \mathit{Bel}(A, \mathit{InfoNeed}(B, N, t_3, C_n), t_1), t_b),\\
\phi &= \mathit{Int.Th}(C, [\forall t' < t_3 \cdot \mathit{BChange}(C, N, t') \Rightarrow \exists t_a, t_c \cdot \mathit{Int.To}(C,\\
&\qquad \mathit{ProInform}(C, B, \epsilon', \mathit{info}(C, N), N, t_a, t_c, t_3, C_n), t', t_a, C_n)], t_b, t_b, C_n),\\
\mathit{BChange}(C, N, t) &\triangleq \mathit{info}_t(C, N) \neq \mathit{info}_{t-1}(C, N),\\
C_p &= \mathit{Bel}(A, \mathit{InfoNeed}(B, N, t_3, C_n), t_1) \land \mathit{Bel}(A, \mathit{defined}(\mathit{info}(C, N)), t_1) \land {}\\
&\quad \neg\mathit{defined}(\mathit{info}_{t_1}(A, N)) \land \neg\mathit{Bel}(A, \mathit{Bel}(C, \mathit{InfoNeed}(B, N, t_3, C_n), t_1), t_1).
\end{aligned}
\]
(For the notation $\mathit{info}_t$, see footnote 5.)

Notice that this definition requires the context of the information-need to 
be known to the addressee (agent C), since it is part of the mutual belief. This 
enables the information provider (agent C ) to avoid delivering unneeded infor- 
mation when the context no longer holds. 
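The provider-side behavior implied by Definition 5 can be illustrated with a small sketch: the provider pro-informs the needer whenever its evaluation of N changes (BChange), stops once the context no longer holds, and stops at the subscription deadline. Discrete integer time steps, the list of readings, and all names are assumptions made only for this example.

```python
from typing import Callable, List, Sequence, Tuple


def serve_subscription(readings: Sequence[str],
                       context_holds: Callable[[int], bool],
                       deadline: int) -> List[Tuple[int, str]]:
    """readings[t] plays the role of info_t(C, N) at time step t.

    Returns the (time, value) pairs at which the provider would ProInform
    the needer. Only time steps strictly before the deadline (t_3) are served.
    """
    deliveries = []
    for t in range(1, min(deadline, len(readings))):
        if not context_holds(t):
            break                                  # context C_n failed: stop delivering
        if readings[t] != readings[t - 1]:         # BChange(C, N, t)
            deliveries.append((t, readings[t]))
    return deliveries


forecast = ["sunny", "sunny", "rain", "rain", "storm", "clear"]
print(serve_subscription(forecast, lambda t: True, deadline=5))
```

Because the context is checked at every step, a provider that learns the need has lapsed simply stops, which is the saving the paragraph above describes.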

A special case of “third-party subscribe” is the case in which the information
needer acts as the broker agent to issue a subscription request on behalf of itself
to an information service provider. Hence, a two-party subscription action can
be modeled as 3PTSubscribe(A, A, C, $\epsilon$, N, $t_1$, $t_2$, $t_3$, $C_n$).

Upon receiving a 3PTSubscribe request, the service provider (agent C in
Definition 5) can reply in at least three ways. It can accept the request and
commit to proactively delivering the needed information to agent B whenever
the information changes. Alternatively, it can reject the request by letting A
know that it has no intention to deliver information to B. Finally, it can agree to
believe the information-need of B, but choose not to make a strong commitment
to proactively informing B. This option still allows agent C to consider (i.e.,
potentially intend) to ProInform B later based on Theorem 1, yet it gives
agent C the flexibility to decide whether to commit to ProInform based on the
current situation (e.g., taking into account C’s current cognitive load). We
call these three replies AcceptSub, RejectSub, and WeakAcceptSub respectively.
They are formally defined below.

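Let $Q = (\forall t' < t_3 \cdot \mathit{BChange}(C, N, t') \Rightarrow \exists t_a, t_c \cdot \mathit{Int.To}(C, \mathit{ProInform}(C, B, \epsilon', \mathit{info}(C, N), N, t_a, t_c, t_3, C_n), t', t_a, C_n))$.

\[
\begin{aligned}
\mathit{AcceptSub}(C, B, A, \epsilon, N, t_1, t_2, t_3, C_n) &\triangleq \mathit{Attempt}(C, \epsilon, \psi, \psi, C_n, t_1, t_2),\\
\mathit{RejectSub}(C, B, A, \epsilon, N, t_1, t_2, t_3, C_n) &\triangleq \mathit{Attempt}(C, \epsilon, \phi, \phi, C_n, t_1, t_2),\\
\mathit{WeakAcceptSub}(C, B, A, \epsilon, N, t_1, t_2, t_3, C_n) &\triangleq \mathit{Attempt}(C, \epsilon, \rho, \rho, C_n, t_1, t_2),
\end{aligned}
\]
where
\[
\begin{aligned}
\psi &= \mathit{MB}(\{A, C\}, \mathit{Bel}(C, \mathit{InfoNeed}(B, N, t_3, C_n), t_2) \land \mathit{Bel}(C, Q, t_2), t_2),\\
\phi &= \mathit{MB}(\{A, C\}, \neg\mathit{Bel}(C, Q, t_2), t_2),\\
\rho &= \mathit{MB}(\{A, C\}, \mathit{Bel}(C, \mathit{InfoNeed}(B, N, t_3, C_n), t_2), t_2).
\end{aligned}
\]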

Similar to Theorem 1, an agent could assist its teammates by performing
3PTSubscribe. The proof is based on the indirect effect of 3PTSubscribe, which
can Lead to Bel(B, info(B, N), t').

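Theorem 2. $\forall A, B, C \in TA,\ N, C_n, t, t' > t,$

\[
\begin{aligned}
&\mathit{Bel}(A, \mathit{InfoNeed}(B, N, t', C_n), t) \land \neg\mathit{defined}(\mathit{info}_t(A, N)) \land {}\\
&\mathit{Bel}(A, \mathit{defined}(\mathit{info}(C, N)), t) \land \neg\mathit{Bel}(A, \mathit{Bel}(B, \mathit{info}(B, N), t'), t) \Rightarrow\\
&\quad (\exists t_1, t_2, C_p \cdot \mathit{Pot.Int.To}(A, \mathit{3PTSubscribe}(A, B, C, \epsilon, N, t_1, t_2, t', C_n), t, t_1, C_p)).
\end{aligned}
\]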
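Read side by side, Theorems 1 and 2 suggest a simple dispatch for a helping agent A. The sketch below is an informal reading, not the authors' procedure: booleans stand in for A's beliefs, `None` stands for an undefined info(A, N), and the function and parameter names are hypothetical. The result is only a candidate act, mirroring the potential intention (Pot.Int.To) in both theorems.

```python
from typing import List, Optional, Tuple


def candidate_act(believes_need: bool,
                  own_info: Optional[object],
                  believes_needer_informed: bool,
                  known_providers: List[str]) -> Optional[Tuple[str, object]]:
    """Return the communicative act A would consider, or None.

    believes_need            ~ Bel(A, InfoNeed(B, N, t', C_n), t)
    own_info                 ~ info(A, N), None when undefined
    believes_needer_informed ~ A already believes B will hold the information
    known_providers          ~ agents C with Bel(A, defined(info(C, N)), t)
    """
    if not believes_need or believes_needer_informed:
        return None
    if own_info is not None:
        return ("ProInform", own_info)              # premises of Theorem 1
    if known_providers:
        return ("3PTSubscribe", known_providers[0]) # premises of Theorem 2
    return None                                     # A knows no provider


print(candidate_act(True, "rain at 0600", False, ["C"]))
print(candidate_act(True, None, False, ["C", "D"]))
```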

5 $\mathit{info}_t(C, N)$ means C evaluates N at t.

In addition to 3PTSubscribe, there are at least two other ways a third-party
agent can assist a team member with its information-needs: (1) Ask-Proinform:
agent A asks agent C, then pro-informs agent B upon receiving replies from C;
(2) request-inform: agent A requests agent C to Inform agent B directly (by
composing request and inform together) 6.

In the Ask-Proinform approach, agent A needs to perform two communicative
actions. The benefit is that A can also obtain the information as a by-product
during the process. In the request-inform approach, agent A only needs to
perform one communicative action. The drawback is that agent A cannot obtain
the information.

An agent’s choice between these two approaches and the acts mentioned
earlier (i.e., Inform-InfoNeed and 3PTSubscribe) could depend on the nature of
the information-needs. For instance, if the information needed is static,
request-inform is better than 3PTSubscribe, because the former relieves the
information-providing agent from monitoring a need in order to detect changes.



5 Conversation Policies with Proactiveness 

Intentional semantics of performatives is desirable because humans’ choice of
commitments to communicative acts really involves reasoning about the beliefs,
intentions, and abilities of other agents. However, reliable logical reasoning about
the private beliefs and goals of others is technically difficult. Practical agent
systems typically employ various assumptions to simplify this issue. One promising
approach is to frame the semantics of performatives using protocols or
conversation policies. As publicly shared, abstract, combinatorial, and normative
constraints on the potentially unbounded universe of semantically coherent
message sequences [14], conversation policies make it easier for the agents involved
in a conversation to model and reason about each other, and restrict agents’
attention to a smaller set of possible responses than they would otherwise have
to consider.

Conversation protocols are traditionally specified as finite state machines [6,
15]. Enhanced Dooley graphs [16], Colored Petri Nets [17], and Landmark-based
representation [18] were proposed to specify richer semantics of protocols. For
instance, in Landmark-based representation, a protocol (family) is specified as a
sequence of waypoints (landmarks) that must be followed in order to accomplish
the goal associated with that protocol, while concrete protocols are realized by
specifying action expressions for each landmark transition such that performing
the action expressions provably results in the landmark transitions [18]. Here we
only consider concrete protocols, which are viewed as patterns of communicative
acts and whose semantics are tied to those of the individual acts involved.

One of our design criteria for conversation protocols is that they should be able
to enhance team intelligence concerning others’ information-needs by
considering the flow of information-needs as well as of information itself. Figure 1 shows
the Petri-Net representation of a conversation protocol regarding ProInform,
where the applicable contexts and goal (let B know I) are encoded as predicates
and kept in the start node and main end node (i.e., e1), respectively. The
protocol covers all the acceptable end points possibly occurring in conversations
between agents A and B: it terminates when B accepts, keeps silent or refuses the
pro-informed information, or when A accepts B’s refusal of information needs.

6 It differs from PROXY with INFORM as the embedded act [13], which, like
forward, requires that the originating agent A already believe the information to be
delivered.




Fig. 1. A conversation policy involving ProInform regarding information I



One case is a bit involved, wherein agent A keeps trying to help B figure out
her related information needs derived from I and appropriate inference
knowledge. Suppose agent A initiates ProInform to agent B about information I that
A believes B will need, but B responds with RefuseNeed. A has two choices
at this point: either A accepts B’s refusal and revises his beliefs regarding B’s
information needs; or, assuming B could not recognize her own information needs
regarding I (e.g., due to lack of inference knowledge), A takes K as B’s
information-need that is closer than I to B’s purpose (e.g., action performing),
and adopts another instance of ProInform, this time with K instead of I. Such
a recursive process may end when A chooses to accept B’s refusal, or when B
clarifies to A that her refusal is not due to lack of certain inference knowledge (e.g., B
regards A’s anticipation of her needs as wrong).

It’s easy to show that the protocol is complete in the sense that no undis- 
charged commitments are left behind [18]. The protocol is also correct in the 
sense that successful execution of the protocol can achieve the goal of the pro- 
tocol (refer to Property 1). 
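Since the figure itself is not reproduced here, the end points and the recursive case described above can be approximated by a small transition table. The state names and message labels below are hypothetical stand-ins (the original Petri net's places are not recoverable from the text), and B's "clarification" exit is omitted for brevity.

```python
from typing import Dict, List, Tuple

# (current state, message) -> next state; names are illustrative only.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("start", "ProInform"): "informed",
    ("informed", "Accept"): "end_accepted",
    ("informed", "Silence"): "end_silent",
    ("informed", "RefuseInfo"): "end_info_refused",
    ("informed", "RefuseNeed"): "need_refused",
    ("need_refused", "AcceptRefusal"): "end_need_revised",
    ("need_refused", "ProInform"): "informed",   # A retries with K instead of I
}
END_STATES = {"end_accepted", "end_silent", "end_info_refused", "end_need_revised"}


def run_conversation(messages: List[str]) -> str:
    """Follow the table; raise ValueError on a message the policy does not allow."""
    state = "start"
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"{message!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


print(run_conversation(["ProInform", "Accept"]))
print(run_conversation(["ProInform", "RefuseNeed", "ProInform", "Accept"]))
```

A table of this kind makes the completeness claim easy to check mechanically: every non-final state has at least one outgoing transition that eventually reaches an end state.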

Conforming to the abovementioned criterion, we also designed a protocol
involving the communicative act 3PTSubscribe, as shown in Figure 2. There are three
end points: either agent C accepts agent A’s subscription regarding agent B’s
information needs; or C weakly accepts agent A’s subscription (i.e., C comes
to believe B’s information needs, but does not make a commitment) and agent A
chooses to end this helping behavior by keeping silent; or C rejects A’s
subscription, and A comes to take C’s view regarding B’s needs 7 after being informed
by C that C does not believe B will need I.

7 That is, at state s1, agent A believes that agent C does not think B has a need
regarding I. At the end state e3, A will revise his mental model about B’s needs.

Fig. 2. A conversation policy involving 3PTSubscribe



Likewise, this protocol allows recursive invocations (with different third-party
service providers): (1) at state s2 agent A chooses to continue helping B (Insist)
by finding another potential information provider (D) and attempting to
subscribe B’s needs from D; (2) at state s3 agent C replies “Yes” to A’s query (i.e.,
C rejects A’s subscription in the situation where C himself is already aware of
B’s needs 8); (3) at state s1, instead of accepting C’s view on B’s needs, agent
A insists on his/her own viewpoint of B’s needs and attempts to subscribe B’s
needs from another known teammate.

6 Comparison 

Reasoning about speech acts can be traced to the work of Austin [19], which
was extended by Searle in [20]. In [5], Cohen and Levesque modeled speech acts
as actions of rational agents in their framework of intentions. Since then,
several agent communication languages have been proposed, such as Arcol [21], KQML
[22], and FIPA’s ACL (<http://www.fipa.org/>). The formal semantics of the
performatives in these languages are all framed in terms of mental attitudes.

The way of defining semantics for performatives in this paper shares the same 
origin with those adopted in the abovementioned languages. A common element 
lies in the strictly declarative semantics of performatives. For example, Arcol uses 
performance conditions to specify the semantics of communicative acts. KQML 
adopts a more operational approach by using preconditions, postconditions and 
completion conditions. FIPA ACL is heavily influenced by Arcol, wherein the 
semantics of performatives are specified by feasibility preconditions and rational 
effect, both of which are formulas of a semantic language SL. The semantics of 
proactive performatives defined in this paper draws heavily on Cohen’s work on 
performative-as-attempt.

8 Most likely, C cannot help B with B’s needs because C is not an information provider
of I. In such a case, C’s reply is actually an indirect speech act, from which A can
infer that C does not have (the observability regarding) I. However, there may exist
other reasons, say, C is simply too busy. In any case, at state s0 agent A needs to
revise his/her model of C appropriately.

The main difference is not in the way of defining semantics, but in the
requirement of proactive performatives that, prior to delivering information to an
agent, the speaker needs to know (either by anticipating or by being informed)
that agent’s information-needs, which guarantees that the information delivered is
relevant to what the receiver should hold in order to participate in team activities
smoothly. To fully support such proactive communications, we also established a
framework for reasoning about others’ information needs. Needs-driven communication
is also partially allowed in Arcol. For instance, in Arcol if agent A is informed
that agent B needs some information, A would supply that information as if B
had requested it, by reducing the explicit inform to an implicit request. However,
in essence, agent A acts in a reactive rather than a proactive way, because Arcol
lacks a mechanism for anticipating information needs as presented in this paper.

ProInform (proactive inform) defined in this paper is comparable with tell
in KQML, although they are not equivalent per se. Both tell and ProInform
require that an agent cannot offer unsolicited information to another agent. The
modal operator WANT in KQML, which stands for the psychological state of
desire, plays the same role as InfoNeed. However, the semantics of WANT is
left open for generality. InfoNeed can be viewed as an explicit way of expressing
information-needs under a certain context.

Both 3PTSubscribe and broker-one in KQML involve three parties (they
have different semantics, though). However, 3PTSubscribe is initiated by a
broker agent, while broker-one is not. Consequently, the speaker of 3PTSubscribe
needs to know the other two parties, while the speaker of broker-one only needs
to know the broker agent. This difference results from the fact that we are
focusing on proactive information delivery by anticipating information-needs of
teammates, while KQML does not. In our approach, if an agent does not know
any information provider of information I, it could choose not to offer help. Of
course, the needer itself could alternatively publish its needs to a certain facilitator
agent in its team, who then might initiate a request (involving three parties) to
some known provider. In this sense, broker-one(A, B, ask-if(A, X)) [22]
can be simulated by publish and request. However, 3PTSubscribe cannot be
easily simulated in KQML.

Proxy is defined in FIPA [10] as an Inform between the originating agent and
the middle agent, which captures rather weak third-party semantics. Stronger
third-party semantics, as we have introduced in this paper, has been independently
defined for PROXY and PROXY-WEAK in [13]. Both PROXY and
PROXY-WEAK are based on REQUEST. PROXY imposes significant commitments
upon the intermediate agent, while PROXY-WEAK reduces the burden placed
upon the intermediate agent. “PROXY of an INFORM” and “PROXY-WEAK
of an INFORM” are different from 3PTSubscribe. PROXY of an INFORM
requires that the middle agent believe the information that the speaker wants him/her
to forward to the target agent. Even though PROXY-WEAK of an INFORM
loosens this requirement, both still require that the speaker already hold the
information to be delivered. 3PTSubscribe, focusing on information-needs, applies
to situations in which the speaker does not have the information needed by others.

To fully understand the ties between the semantics of communicative acts
and patterns of these acts, conversation policies or protocols have been studied
heavily in the ACL field [23, 24, 6, 18, 25, 26]. The protocols proposed in this paper are
rather simple, but they are helpful in understanding proactive communications
enabled by proactive communicative acts and how information-needs flow.

More recently, social agency has been emphasized as a complement to mental agency
due to the fact that communication is inherently public [27], which requires that the
social construction of communication be treated as a first-class notion rather
than as a derivative of the mentalist concepts. For instance, in [15] speech acts
are defined as social commitments, which are obligations relativized to both
the beneficiary agent and the whole team as the social context. Kumar [18]
argued that joint commitments may simulate social commitments, because PWAG
entails a social commitment provided that it is made public. We agree on this
point. In our definition, the context argument of ProInform and 3PTSubscribe
includes the context of the information-need under concern. Thus, an
information-providing agent could terminate the information delivery service once the
context is no longer valid. The contexts can be enriched to include protocols in
force, as suggested in [6], and even social relations.

To summarize, we are not proposing a complete ACL that covers the various
categories of communicative acts (assertives, directives, commissives, permissives,
prohibitives, declaratives, expressives) [27], nor are we focusing on the
semantics of performatives alone. We are more concerned with information-needs and
how to enable proactive information flows among teammates by reasoning about
information-needs. The semantics of the performatives presented in this paper
are motivated by our study of team proactivity driven by information-needs,
and they rely on the speaker’s awareness of information-needs.

7 Concluding Remarks 

In this paper we established a theory of proactive information exchanges by
introducing the concept of “information-needs”, providing axioms for anticipating
the information-needs of teammates based on shared team knowledge such as
shared team process and joint goals, and defining the semantics of ProInform
and 3PTSubscribe based on the speaker’s awareness of the information-needs
of teammates. It is shown that communications using these proactive
performatives can be derived as helping behaviors. Conversation policies involving these
proactive performatives are also discussed.

Agent infrastructures like Grid [28] aim to enable trans-architecture teams
of agents (a team consisting of subteams of agents with different architectures
like TEAMCORE [29], D’Agents [30], CAST [31]) to support joint activities by
providing mechanisms for accessing shared ontologies, and for publishing and
subscribing to agents’ services. 3PTSubscribe plays an important role in sharing
information among hierarchical teams, but there is still a long way to go to fully
support proactive communications among teams with heterogeneous agents.







The work in this paper not only serves as a formal specification for designing
agent architectures, algorithms, and applications that support proactive
information exchanges among agents in a team, but also offers opportunities for
extending existing agent communication protocols to support proactive teamwork,
and for further studying proactive information delivery among teams involving
both human and software agents.

Abstract. Protocols or parts of protocols are frequently reused across
projects if they are sufficiently generic. For instance, the Contract Net
protocol can be used verbatim or within other protocols, such as those for
supply chain management. Such reuse can be difficult because current
interaction protocol formalisms offer little support for reusability. In this
paper, we present a new approach based on a modular architecture. A
protocol is no longer monolithic but a composition of modules called
micro-protocols. This approach improves modularity and reusability in
interaction protocol engineering. We apply this idea to the example of
supply chain management.



1 Introduction 

Communication represents one of the main components in multiagent systems,
as stated in the Vowels approach [6] [7] under the letter I, which stands for
Interaction. Traditionally, communication is described through protocols. Even
though these protocols are as old as multiagent systems, designers still face
challenges when designing communication. One of the main challenges is the
lack of flexibility, which requires designers to design protocols from scratch each time
[22]. Designers cannot take advantage of previous protocols to define new ones.
This problem is due to the lack of flexibility in formal description techniques, as
stated in [16]. Formal description techniques do not consider that a protocol can
be made of components (or modules) that each represent a part of the interaction. For
instance, one would like to represent that the interaction in the supply chain problem
exhibits a first phase where customers express what they want to purchase, a
second phase where customers and companies negotiate price and delay, and
finally a third phase where the different tasks for the production are processed,
from task allocation to shipping. Above all, one would like to express that it
is possible to replace a specific part of the protocol by another one because
the current one no longer fits the needs. A component-based approach for protocols
would allow this flexibility and reusability.

A first attempt to make interaction protocols more flexible was made in [16].
Protocols are composed of micro-protocols that represent parts of the interaction.
This proposal is, to the best of our knowledge, the only one in use in
applications [23]. However, this approach remains simple and is of little help
when designers consider negotiation or protocols that belong
to the same class, such as the Contract Net protocol and the Iterated Contract Net, that is to say,
when expressiveness is at stake.

The aim of this paper is to present a revision of the current approach that 
tackles a broader range of interaction and negotiation protocols. This 
revision adds several new fields to increase the expressiveness of 
micro-protocols. Such fields are, for instance, the participant cardinality, 
the agent communication language, the ontology and the norms, to name a few. A 
second refinement is the possibility for a micro-protocol to contain 
micro-protocols, thus offering a hierarchy of decomposition. The last 
enhancement is the description of micro-protocols as XML files, although it is 
not presented here due to lack of space. An XML description is easier for both 
agents and designers to read and to retrieve information from. Moreover, 
documentation can be generated from the XML file. 

The paper is organized as follows. Section 2 motivates the use of 
micro-protocols in interaction protocols and presents their advantages; it 
summarizes what was said in [16]. As stated above, the current version of 
micro-protocols is insufficient to represent negotiation protocols or a broad 
range of protocols; Section 3 explains why the current proposal missed its 
objectives. Section 4 describes the new approach for micro-protocols. Section 5 
considers the protocol level that binds micro-protocols together. Section 6 
presents the example of supply chain management. Due to lack of space, we 
focus on only one micro-protocol, which defines the negotiation between 
customers and companies. Section 7 concludes the paper and gives directions 
for future work. 



2 Modularity in Interaction Protocols 

Traditionally, designers either use formal description techniques coming from 
distributed systems [11] to represent interaction protocols, or define new 
ones that encompass agent features, such as COOL [1], temporal logic [10] or 
Agent UML [20]. These approaches lack flexibility, as stated in [16]. It is 
difficult to modify a protocol as soon as a part of it no longer fits the 
designer's needs. To cope with this problem, we proposed a modular 
architecture to describe interaction protocols [16]. The aim of this approach 
is to allow designers to combine modules into a protocol that meets a specific 
need. Moreover, designers can easily replace a module by another if it no 
longer satisfies the needs. These modules are called micro-protocols, as a 
reminder that they convey a part of the interaction usually carried by 
protocols. The language CPDL (Communication Protocol Description Language) 
combines these micro-protocols to form a protocol. 

A modular approach contributes to interaction protocols in several ways: 

1. It should allow for the definition of reusable protocols. With the current 
formalisms, one cannot easily replace a piece of a protocol by another, since 
this requires locating the beginning and the end of such pieces. Furthermore, 
it is difficult to identify the exact semantics of these protocol pieces. 

By representing a protocol as a combination of micro-protocols and providing 
a semantics for each of them, they can be reused in other protocols. In a 
modular approach, a protocol becomes more flexible since one just has to 
remove the micro-protocol that does not suit and replace it by another one. 
The only constraint is to correctly connect the new micro-protocol to the 
others already in place. 

2. It should allow for abstraction capabilities. Following a modular approach 
introduces a meta-level. A basic level corresponds to the elements pertaining 
to a micro-protocol. A higher level conveys the global protocol in a more 
abstract fashion, i.e. a plan. With the current approaches, a protocol’s 
global view cannot easily be perceived. The use of micro-protocols hides 
implementation details that are not needed to understand a protocol. 

3. It should facilitate the validation process. We agree with the following 
idea [11]: “A well-structured protocol can be built from a small number of 
well-designed and well-understood pieces. Each piece performs one function and 
performs it well. To understand the working of the protocol it should suffice 
to understand the working of the pieces from which it is constructed and the 
way in which they interact. Protocols that are designed in this way are easier 
to understand and easier to implement efficiently, and they are more likely to 
be verifiable and maintainable.” With components, it becomes easier to 
validate a protocol because (1) each component is less complex than a whole 
protocol and can be verified separately, and (2) the global protocol is then 
verified by checking that its micro-protocols are correctly connected 
together. 

Modularity has received little attention in interaction protocols; we can 
only cite work done around COSY [3] and AgenTalk [18]. Work on communication 
and cooperation in COSY has led to viewing a protocol as a combination of 
primitive protocols. Each primitive protocol can be represented by a tree 
where each node corresponds to a particular situation and each transition 
corresponds to a message that an agent can either receive or send. The main 
weakness of COSY is the expression of semantics. Semantics is based on the 
first communicative act in the primitive protocol. As a consequence, it is 
difficult to distinguish two primitive protocols with the same prefix. 

Work in AgenTalk is different since it includes the notion of inheritance. 
Such an approach is interesting for deriving protocols for specific tasks. 
Each protocol is represented as a finite state machine. Unfortunately, 
designers in AgenTalk need to link transitions from the composite protocol to 
the transitions within the component protocol. This reduces reuse since 
designers need to study the component protocols before linking states. 

To the best of our knowledge, our approach seems to be the only one proposed 
in the literature and the only one in use [23]. 




294 Benjamin Vitteau and Marc-Philippe Huget 



3 Drawbacks of Current Micro-protocols 

In this section, we first describe the current version of micro-protocols 
before addressing its drawbacks. 



3.1 Current Version of Micro-protocols 

Micro-protocols in our approach are the basic components of a protocol [17]. 
A micro-protocol contains an ordered set of communicative acts. Some 
programming notions, such as loops, decisions, time management, conditions and 
exceptions, are added to this set to improve its expressive power. 
Micro-protocols should be seen as functions in programming. They have 
parameters that are used within the micro-protocol, so the same micro-protocol 
can be reused several times in the same protocol with different parameters. 
Usually, the parameters correspond to the sender, the receiver and the content 
of the message. The content frequently corresponds to the content of the first 
communicative act; subsequent contents are computed according to the 
interaction. 

Current micro-protocols are defined through four fields: 

— a name 

— a semantics 

— a definition 

— the semantics of the parameters 

The name identifies the micro-protocol. Micro-protocols are distinguished by 
their name; as a consequence, names must be unique. The semantics field gives 
the meaning of the micro-protocol. This field is free-format text: designers 
can use either a natural language description or a set of logical formulae. 
The definition is the main part of a micro-protocol. It contains the set of 
communicative acts, to which we add conditions, decisions, loops, time 
management and exceptions to increase expressiveness. Each communicative act 
usually has three parameters: the sender, the receiver and the content of the 
message. The semantics of the parameters gives the meaning associated with 
these parameters. It helps agents to fill in the messages correctly. 
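
As a rough illustration of the four fields, the following Python sketch stores 
a micro-protocol as a record and rejects duplicate names. The class names 
MicroProtocol and MicroProtocolLibrary, the example RequestInformation and its 
definition string are assumptions made for this illustration; they are not 
taken from [17]. 

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MicroProtocol:
    """One current-version micro-protocol with its four fields."""
    name: str                     # must be unique
    semantics: str                # free-format text or logical formulae
    definition: str               # ordered communicative acts, kept as text
    parameter_semantics: Dict[str, str] = field(default_factory=dict)


class MicroProtocolLibrary:
    """Hypothetical registry that enforces the uniqueness of names."""

    def __init__(self) -> None:
        self._by_name: Dict[str, MicroProtocol] = {}

    def register(self, micro_protocol: MicroProtocol) -> None:
        if micro_protocol.name in self._by_name:
            raise ValueError(f"name already used: {micro_protocol.name}")
        self._by_name[micro_protocol.name] = micro_protocol

    def lookup(self, name: str) -> MicroProtocol:
        return self._by_name[name]


# Hypothetical example; the definition string is illustrative only.
request_information = MicroProtocol(
    name="RequestInformation",
    semantics="A asks B for a piece of information and B answers.",
    definition="request(A,B,C).inform(B,A,C)",
    parameter_semantics={"A": "sender", "B": "receiver", "C": "content"},
)
```

Registering request_information twice in the same library raises ValueError, 
which mirrors the requirement that names be unique. 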

Thanks to this expressiveness, it is possible to represent the Contract Net 
protocol [5] as a micro-protocol. The micro-protocol is given in Figure 1. 

We do not explain this micro-protocol in detail; readers are referred to [12]. 
The important elements in the definition field are the synchronization 
mechanism, the time management, the alternatives and the exception. The 
keyword token is used for the synchronization between agents. The number in 
parentheses corresponds to the number of messages the receiver expects before 
resuming the interaction. If a star is used instead of a number, the number of 
agents is known only at run time. Not giving the number of expected messages 
offers more flexibility to designers, since they are not obliged to hardwire 
this piece of information into the micro-protocol. The synchronization is 
realized on the next message. Time management is performed with the keyword 
time. 

Name: ContractNet 
Parameters’ semantics: 

A: sender 
B: receiver 
T: task 
P: proposal 
R: refuse 
Definition: 

cfp(A,B,T).token(*).time(10).(not-understood(B,A,T).exit | 

refuse(B,A,R).exit | propose(B,A,P)). 

(reject-proposal(A,B,R) | 
accept-proposal(A,B,P). 
exception{cancel=exit}.(failure(B,A,R) | inform(B,A,T))) 
Semantics: ContractNet 



Fig. 1. The Contract Net Protocol as a Micro-Protocol 



The number within the parentheses means that the receiver waits for answers 
from senders for 10 time units. Putting token and time together has a 
different semantics: the receiver expects several answers but waits at most 10 
time units. When the deadline has passed, it continues the interaction even if 
it has not received all the messages. This approach prevents deadlocks in the 
interaction. The main novelty of this micro-protocol is exception management. 
This notion is rarely proposed in agent communication; Moore seems to be the 
only one to consider it [19]. Exceptions are useful to unblock situations, for 
instance if a receiver is waiting for n answers and only gets m (m < n), or if 
an agent receives an unexpected message. The exception is given within curly 
brackets together with the action to perform for this exception. For instance, 
in the Contract Net example in Figure 1, as soon as the task manager receives 
a cancel message, it triggers an exception and performs the action exit, which 
stops the interaction with this agent. Finally, alternatives represent 
different paths in the interaction. They are written within parentheses and 
separated by a vertical bar. An example is given on the first line of the 
definition, where agents can answer with either a not-understood message, a 
refuse message or a propose message. Readers can consult [13] for other 
examples. 
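
The combined effect of token and time can be sketched in Python as follows. 
This is a simplified simulation under stated assumptions, not the authors' 
interaction module: replies carry a hypothetical arrival time, the expected 
count is passed explicitly (None stands for an unknown count), and the names 
Reply, collect_replies and proposals are invented for the example. 

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reply:
    act: str        # communicative act name, e.g. "propose"
    sender: str
    arrival: float  # time units elapsed since the cfp was sent (assumption)


def collect_replies(inbox: List[Reply], expected: Optional[int],
                    deadline: float) -> List[Reply]:
    """Replies seen before the token count is reached or the deadline passes."""
    received: List[Reply] = []
    for reply in sorted(inbox, key=lambda r: r.arrival):
        if reply.arrival > deadline:
            break                      # time(deadline) expired: stop waiting
        received.append(reply)
        if expected is not None and len(received) >= expected:
            break                      # token(expected) satisfied
    return received


def proposals(replies: List[Reply]) -> List[Reply]:
    """Keep the alternative that continues; the two others lead to exit."""
    return [r for r in replies if r.act == "propose"]
```

With three expected answers and a deadline of 10 time units, a reply arriving 
at time 12 is ignored and the interaction continues with the answers already 
received, which is how the deadline avoids a deadlock. 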

3.2 Criticisms of the Current Version 

Even if micro-protocols were successfully applied to the Baghera project [23], 
they remain simple and several criticisms can be made. This section summarizes 
them. 

Micro-protocol Distinction. One of our criticisms of COSY’s approach is that 
it is not possible to describe a broad range of primitives because of the way 
they are distinguished: the distinction is made on the first communicative 
act. Even if micro-protocols allow more possibilities, the problem of 
distinction is still present. Micro-protocols are distinguished by their name. 
Such an approach allows, a priori, an infinite number of micro-protocols, but 
it breaks down due to a lack of readability. Suppose that we want to express 
the Contract Net protocol as stated in Section 3; we simply name the 
micro-protocol Contract Net. Now suppose that we want two Contract Net 
protocols: the current one and an iterated version [9]. We naturally extend 
the name of the micro-protocol to express the different Contract Nets. This 
approach quickly becomes intractable as soon as micro-protocols differ only on 
some features such as the number of agents, the way to interact (broadcast, 
multicast), the negotiation phase, etc. 

To increase the power of distinction between micro-protocols, it seems 
reasonable to consider as many fields as necessary to characterize 
micro-protocols well. For instance, a field on cardinality is useful to 
distinguish micro-protocols dealing with one-to-one communication from those 
dealing with one-to-many communication. Several new fields are added to the 
current version of micro-protocols to increase expressiveness, as Section 4 
demonstrates. 



Semantic Weaknesses. The semantics field in the current version of 
micro-protocols is free-format text. Designers are free to use a natural 
language description, a set of logical formulae or whatever they want. The 
absence of a well-structured field for the semantics decreases the ability to 
describe the micro-protocol accurately. Designers can postpone writing a 
formal description and, as a consequence, provide a micro-protocol without 
clear semantics. The example shown in Figure 1 clearly demonstrates this point 
since the semantics is reduced to ContractNet. 

In the new proposal, we strengthen the semantics definition by adding several 
fields. The semantics field is supported by fields such as prerequisite, 
termination or type (see Section 4). 



Conditions on Execution. Conditions on execution are missing in the current 
version of micro-protocols. It is thus not possible to describe the conditions 
allowing or preventing the execution of a micro-protocol. Designers have no 
choice other than to describe this information outside micro-protocols. It is 
important that micro-protocols contain the conditions which are required. 
These conditions refer to the mental states of the agent that wants to use the 
micro-protocol, the conditions on the environment, etc. 

The current version has some conditions via CPDL [17], the language 
responsible for linking the different micro-protocols. However, these 
conditions are not sufficiently fine-grained to be worthwhile: they apply to a 
set of micro-protocols and not to each micro-protocol. 

Agent Communication Languages and Ontologies. Even if micro-protocols use 
communicative acts, the agent communication language and the ontology used are 
not recorded. Knowing the agent communication language is important for 
interoperability, particularly if agents are heterogeneous. Recording the 
agent communication language helps agents to know whether they have to 
translate messages for other agents. Ontologies allow agents to share the same 
meaning for the terms used. 

Micro-protocols Defined for Agents. The first version of micro-protocols is 
directed at agents and not at designers. Designers give a name to the 
micro-protocol, describe the parameters and the semantics, and finally insert 
the definition. Agents use them during their interactions. As the domain of 
application of such micro-protocols grows, it becomes important to improve the 
quality and the documentation of micro-protocols. At present, designers have 
to write the documentation separately since there is no way to store this 
information in the micro-protocol. The principle of reuse is violated since it 
is not possible to have both documentation and definition in the same element. 

The new version of micro-protocols attempts to fix this problem by inserting 
several new fields in the micro-protocol. Moreover, we propose to store the 
micro-protocol as an XML document [25]. Thus, the micro-protocol is readable 
by both designers and agents, and designers can document micro-protocols. 
Finally, another advantage of XML is XSLT, which allows designers to generate 
documentation automatically. This point is in favour of reuse. 
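
The XML format itself is not given in this paper. Purely as an illustration of 
keeping documentation and definition in one document that both designers and 
programs can read, the Python sketch below writes and reads a flat XML element 
with the standard xml.etree.ElementTree module. The element names 
(micro-protocol, name, type, function) and the sample text are hypothetical 
and do not reproduce the authors' schema. 

```python
import xml.etree.ElementTree as ET
from typing import Dict


def to_xml(fields: Dict[str, str]) -> str:
    """Serialize field names and values as children of one root element."""
    root = ET.Element("micro-protocol")          # hypothetical element name
    for tag, text in fields.items():
        ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode")


def from_xml(document: str) -> Dict[str, str]:
    """Read the fields back, as an agent or a documentation tool might."""
    root = ET.fromstring(document)
    return {child.tag: (child.text or "") for child in root}


# Illustrative content only.
fields = {
    "name": "English",
    "type": "Auction",
    "function": "Sell one item to the highest bidder.",
}
document = to_xml(fields)
```

Reading the document back with from_xml(document) returns the same 
dictionary, so a single file can serve both uses. 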

Absence of Hierarchical Decomposition. In the current version of 
micro-protocols, there are three levels: the communicative act level, the 
micro-protocol level and the protocol level. Communicative acts are contained 
in micro-protocols, and micro-protocols are contained in protocols. Even if 
this decomposition offers many possibilities, a deeper decomposition is 
preferable. In the new proposal, we still have the communicative act level and 
the protocol level, but it is now possible to have as many levels as we want 
between these two. Hence, a micro-protocol can be composed of micro-protocols. 
Consequently, the notion of micro-protocol itself must be understood more 
broadly than as the smallest piece of a protocol. 
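
A nested decomposition of this kind can be sketched as a recursive data 
structure. The Python example below is a minimal illustration; the class 
names, the helper functions and the Negotiation, Offer and Answer 
micro-protocols are assumptions chosen for the example. 

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class CommunicativeAct:
    name: str


@dataclass
class MicroProtocol:
    name: str
    # A part is either a nested micro-protocol or a communicative act.
    parts: List[Union["MicroProtocol", CommunicativeAct]] = field(default_factory=list)


def communicative_acts(mp: MicroProtocol) -> List[str]:
    """Names of all communicative acts, in order, across every nested level."""
    acts: List[str] = []
    for part in mp.parts:
        if isinstance(part, MicroProtocol):
            acts.extend(communicative_acts(part))
        else:
            acts.append(part.name)
    return acts


def levels(mp: MicroProtocol) -> int:
    """Number of micro-protocol levels below and including this one."""
    nested = [levels(p) for p in mp.parts if isinstance(p, MicroProtocol)]
    return 1 + max(nested, default=0)


# Hypothetical two-level decomposition.
negotiation = MicroProtocol("Negotiation", [
    MicroProtocol("Offer", [CommunicativeAct("propose")]),
    MicroProtocol("Answer", [CommunicativeAct("accept-proposal")]),
])
```

Here negotiation has two micro-protocol levels, and flattening it yields the 
communicative act level. 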

4 Proposal for New Micro-protocols 

As stated in Section 3, the current version of micro-protocols is insufficient 
to represent negotiation protocols and protocols that differ only slightly. In 
this section, we present a new proposal based on the drawbacks identified in 
Section 3. Concerns about hierarchical decomposition are treated in Section 5. 
We focus here only on the micro-protocol content. 

Micro-protocols are still organized as a document with fields. However, 
several new fields were added to the previous version of micro-protocols. The 
micro-protocol fields are the following: 

— Name/Type 

— Synonyms 

— Participant cardinality 

— Other attributes 

— Agent communication language 

— Ontology 

— Semantics 

— Function 

— Behavior 

— Roles 

— Initiator 

— Norms 

— Constraints 

— Prerequisite 

— Termination 

— Input/Output 

— Sample code 

The remainder of this section describes the different fields of 
micro-protocols and indicates which fields are used by designers and which by 
agents. 

Name/Type: Name identifies the micro-protocol. The field Name should be 
unique to ease the selection of micro-protocols. However, if several 
micro-protocols share the same name, designers have to study the 
micro-protocols, particularly the fields Function, Semantics and Other 
attributes, to distinguish them. Type categorizes micro-protocols to help 
designers select the correct micro-protocol. For instance, a Type could be 
Auction, meaning that the micro-protocol belongs to the auction protocols. 
Designers can be as precise as they want when expressing the type of the 
micro-protocol. Name is defined as a natural language term. The usual way to 
represent Type is also a natural language term, but it is possible to describe 
the Type as a path, in the manner of an object-oriented approach: 
Name_1 : Name_2 : ... : Name_n. It means that the type of this micro-protocol 
is composed of several subtypes separated by colons. Types are ordered from 
left to right. As a consequence, Name_2 is considered a subtype of Name_1, and 
so on. Type is separated from the Name by a slash. Several examples are given 
below: 

Contract Net 
English/Auction 
Dutch/Auction 
Vickrey/Auction 
Iterated/Contract Net 
Iterated/Contract Net : 1 to 5 
Iterated/Contract Net : 1 to n 
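
The Name/Type notation can be read mechanically. The Python sketch below 
splits an entry at the slash and then splits the type path at colons; the 
function names parse_name_type and is_subtype are hypothetical, and treating 
an entry without a slash as a name with an empty type path is an assumption. 

```python
from typing import Tuple

TypePath = Tuple[str, ...]


def parse_name_type(text: str) -> Tuple[str, TypePath]:
    """Split 'Name/Type_1 : Type_2 : ...' into a name and an ordered type path."""
    name, slash, type_part = text.partition("/")
    if not slash:
        return name.strip(), ()          # assumption: no slash means no type
    path = tuple(part.strip() for part in type_part.split(":"))
    return name.strip(), path


def is_subtype(path: TypePath, ancestor: TypePath) -> bool:
    """True when 'path' extends 'ancestor' with at least one more subtype."""
    return len(path) > len(ancestor) and path[:len(ancestor)] == ancestor
```

For instance, parse_name_type("Iterated/Contract Net : 1 to 5") gives the name 
Iterated and the type path (Contract Net, 1 to 5), which is_subtype reports as 
a subtype of (Contract Net,). 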

Synonyms: English auction is one of the names used for this protocol; open 
outcry auction and ascending price auction are also used. The Synonyms field 
gives the list of synonyms to help designers browse lists of micro-protocols. 
Synonyms are represented as a list of terms. 

Participant cardinality: When protocols are described through Petri nets, 
considering a protocol for m participants is not the same as considering a 
protocol for n participants where m ≠ n. Two formats are possible for this 
field: m-n and m. In the first case, m refers to the sender cardinality and n 
refers to the recipient cardinality. Usual values are 1-1, 1-n, n-1 and n-n, 
referring to one-to-one, one-to-many, many-to-one and many-to-many 
communication respectively. In the second case, m refers to the participant 
cardinality without distinguishing the sender cardinality from the recipient 
cardinality. In both cases, m and n must be non-zero positive integers. 
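
A possible way to validate this field is sketched below in Python. It assumes 
that each side is written either as a non-zero positive integer or as the 
letter n standing for a number fixed later, and that the two sides are 
separated by a hyphen; the function name parse_cardinality is hypothetical. 

```python
import re
from typing import Optional, Tuple

# One side: a positive integer without leading zero, or the symbol "n".
_SIDE = r"(?:[1-9][0-9]*|n)"
_CARDINALITY = re.compile(rf"\s*({_SIDE})\s*(?:-\s*({_SIDE})\s*)?")


def parse_cardinality(text: str) -> Tuple[str, Optional[str]]:
    """Return (sender, recipient) for 'm-n', or (participants, None) for 'm'."""
    match = _CARDINALITY.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid participant cardinality: {text!r}")
    return match.group(1), match.group(2)
```

A value such as 0-1 is rejected because zero is not allowed on either side. 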

Other attributes: The field Other attributes lists keywords that characterize 
the micro-protocol. It might be, for instance, that the interaction takes 
place only between authenticated agents, or that messages are encrypted, 
secured or anonymous. Obviously, this list is not exhaustive and greatly 
depends on the context of the micro-protocol. Other attributes are denoted by 
a list of terms. 

Agent Communication Language: This field refers to the language used by the 
agents in this micro-protocol. Usual values are KQML or FIPA ACL. It is thus 
possible to describe the same protocol with two micro-protocols, one for each 
agent communication language. 

Ontology: This field indicates the ontology used in this micro-protocol. 

Semantics: Representing the semantics of a micro-protocol is important to 
help designers understand its meaning. The Behavior field describes informally 
what the intention of the micro-protocol is. That field is insufficient to 
describe the micro-protocol because of the ambiguities and misunderstandings 
inherent in natural language. It is important to describe the micro-protocol 
formally, and the Semantics field is defined for this purpose. 

Function: The Function field gives a summary of the intention of the 
micro-protocol. The Behavior field may be too long to read in the context of 
reuse; it is better if designers can read an abstract. That is the aim of this 
field. 

Behavior: The Behavior field is the main field of the micro-protocol since it 
provides a detailed description of the micro-protocol (the communicative acts, 
the message sequence and the message content). BDI modalities [24] and agent 
actions are also stored in this field. Designers can thus know what the impact 
of the interaction on agents is. This is the field that designers will use to 
generate code for the micro-protocol. The Behavior field can be complemented 
by a diagram that represents the micro-protocol graphically. Due to space 
restrictions, we give only one example in Section 6. 

Roles: Agents take part in interactions and interaction protocols in a 
particular role. For instance, in auctions there are usually two roles: 
auctioneer and participant. This piece of information is used at several 
levels. First, knowing agents’ roles allows designers to know how agents will 
behave in the interaction, what the responsibilities of each agent are and 
what each agent is expected to do. Second, this criterion can distinguish 
between micro-protocols: two micro-protocols could have the same set of 
communicative acts but two different sets of agents’ roles. For instance, in 
the Baghera project [23], the same micro-protocol request-information does not 
have the same meaning when it is applied between a pupil and his/her companion 
as when it is applied between a pupil’s companion and the pupil’s tutor. In 
the first case, the pupil asks whether his/her proof is correct and the 
companion answers yes or no. In the second case, companion and tutor can 
exchange several messages if they need more information about the proof. 
Finally, agents’ roles are used when defining norms for the micro-protocol. 
Norms are linked to agents’ roles in the interaction. Agents will be 
“punished” differently according to their roles and their responsibilities. 

The agents’ cardinality, which defines the number of agents per role, is added 
to agents’ roles. For instance, in an auction there is one and only one 
auctioneer but at least two participants. In an open system, this piece of 
information prevents the system from being jeopardized by too many agents 
entering it and consuming its resources. The agents’ cardinality format is 
m-n, where m and n represent the lower and the upper bound respectively: there 
are at least m and at most n agents in this role. Lower and upper bounds are 
positive numbers. 
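
The per-role bounds can be checked with a few lines of Python. This is a 
sketch under assumptions: the class RoleCardinality and its methods are 
invented for the example, and an absent upper bound is represented by None. 

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleCardinality:
    role: str
    lower: int                    # at least this many agents
    upper: Optional[int] = None   # at most this many; None means unbounded (assumption)

    def may_enter(self, present: int) -> bool:
        """Can one more agent take this role without exceeding the upper bound?"""
        return self.upper is None or present < self.upper

    def is_satisfied(self, present: int) -> bool:
        """Does the current number of agents respect both bounds?"""
        return present >= self.lower and (self.upper is None or present <= self.upper)


# The auction example from the text: exactly one auctioneer, at least two participants.
auctioneer = RoleCardinality("auctioneer", lower=1, upper=1)
participant = RoleCardinality("participant", lower=2)
```

A second auctioneer is refused by may_enter, while an auction with a single 
participant does not yet satisfy the participant role. 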

Initiator: This field states which role initiates the interaction. The role 
must be one defined in the Roles field. 

Norms: Norms were first considered in electronic institutions [8] to force 
agents to perform some actions or to conform to some behaviors. Norms define 
what is permitted, what is obliged and what is forbidden. Norms are also 
meaningful in open systems to prevent actions that would jeopardize the 
system. Norms can be defined formally (see [8]) or informally. An example of a 
norm is: “The participant of an auction that holds the highest bid at the end 
of the auction must pay the amount corresponding to the bid, except if the 
highest bid is less than the reserve price.” 

Constraints: Constraints must hold during the execution of the 
micro-protocol. They are important when designers check properties of 
micro-protocols. Constraints correspond to safety properties (nothing bad 
happens) and liveness properties (something good eventually happens). 
Constraints are defined formally via temporal logic or informally. Examples of 
constraints are mutual exclusion and absence of deadlock, but constraints can 
also be oriented towards the goal of the micro-protocol; for instance, bids 
less than the current price are forbidden. 
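
The bid constraint can be checked over a recorded sequence of bids. The 
Python sketch below assumes that the current price starts at an opening price 
and is raised by each accepted bid; the function name forbidden_bids and this 
pricing rule are assumptions for illustration, not a formal temporal-logic 
encoding. 

```python
from typing import List


def forbidden_bids(bids: List[float], opening_price: float) -> List[int]:
    """Positions of bids that are lower than the current price when placed."""
    current = opening_price
    violations: List[int] = []
    for position, bid in enumerate(bids):
        if bid < current:
            violations.append(position)   # safety constraint violated
        else:
            current = bid                 # accepted bid becomes the current price
    return violations
```

An empty result means that the safety constraint held throughout the trace. 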

Prerequisite: The Prerequisite field describes the conditions that must be 
satisfied before executing the micro-protocol. Prerequisites can be defined 
formally through temporal logic or informally. An example of a prerequisite is 
that agents must be authenticated before requesting information. 

Termination: The Termination field gives the valid exits for the 
micro-protocol. The exit of a micro-protocol is considered valid if it matches 
one defined in the Termination field. For the NetBill protocol [4], a valid 
exit is “the customer has the good and no longer the money, the seller has 
the money and no longer the good.” 
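
Prerequisite and Termination can be pictured as two predicates evaluated 
before and after execution. The Python sketch below encodes the NetBill exit 
quoted above; the prerequisite shown (the customer holds the money and the 
seller holds the good) is a hypothetical condition added for the example, as 
are the names ExecutionConditions, run_checked and exchange. 

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

Holdings = Dict[str, Set[str]]   # agent name -> items it currently holds


@dataclass(frozen=True)
class ExecutionConditions:
    prerequisite: Callable[[Holdings], bool]
    termination: Callable[[Holdings], bool]


def run_checked(conditions: ExecutionConditions, before: Holdings,
                execute: Callable[[Holdings], Holdings]) -> Tuple[Holdings, bool]:
    """Refuse to start without the prerequisite; report whether the exit is valid."""
    if not conditions.prerequisite(before):
        raise RuntimeError("prerequisite not satisfied")
    after = execute(before)
    return after, conditions.termination(after)


def netbill_valid_exit(h: Holdings) -> bool:
    return ("good" in h["customer"] and "money" not in h["customer"]
            and "money" in h["seller"] and "good" not in h["seller"])


def ready_to_trade(h: Holdings) -> bool:   # hypothetical prerequisite
    return "money" in h["customer"] and "good" in h["seller"]


def exchange(h: Holdings) -> Holdings:     # stand-in for the real interaction
    return {"customer": {"good"}, "seller": {"money"}}


netbill = ExecutionConditions(ready_to_trade, netbill_valid_exit)
```

Running the stand-in exchange from a state where the customer holds the money 
and the seller holds the good ends in a valid exit; an execution that leaves 
the holdings unchanged does not. 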

Input/Output: Micro-protocols can be linked to express that a protocol is a 
composition of micro-protocols. To this end, and to foster reuse, it is 
worthwhile for micro-protocols to exhibit entry points and exit points. These 
points are data (called input and output respectively) that will be used in 
the current micro-protocol or in the next micro-protocol. Inputs and outputs 
are discussed in Section 5. 

Sample Code: The previous fields are oriented towards the definition and the 
documentation of micro-protocols. This one is directed at the use of the 
micro-protocol by agents. Two kinds of information can be found in this field: 
either sample code that gives the behavior of the micro-protocol in a 
programming language, or an ordered set of communicative acts and 
micro-protocols that can be used directly by agents as defined in [12]. The 
former corresponds to the usual approach for protocol synthesis as described 
in [11]. The latter is related to our interaction module, where the definition 
is directly executed [14]. 

Table 1 describes the use of these fields by designers and agents. 



Table 1. Use of Micro-Protocol Fields 



Field                           Designers   Agents 
Name/Type                       ✓           ✓ 
Synonyms                        ✓           - 
Participant cardinality         ✓           ✓ 
Other attributes                ✓           - 
Agent communication language    ✓           ✓ 
Semantics                       ✓           - 
Ontology                        ✓           ✓ 
Function                        ✓           - 
Behavior                        ✓           - 
Roles                           ✓           ✓ 
Initiator                       ✓           ✓ 
Norms                           ✓           ✓ 
Constraints                     ✓           ✓ 
Prerequisite                    ✓           ✓ 
Termination                     ✓           ✓ 
Input/Output                    ✓           ✓ 
Sample code                     ✓           ✓ 

We briefly explain this table. All fields are useful for designers. Some of 
the fields are written in natural language and, as a consequence, agents are 
unable to process them. Such fields are Other attributes, Function and 
Behavior. The field Semantics is written as a formal description, but it would 
be very difficult for agents to interpret it. The field Synonyms does not 
apply to agents since we define it as a field for designers. The other fields 
can be used by agents provided that the description is structured so as to be 
understandable by agents. 
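
The split described in Table 1 can be sketched as a simple filter. In the 
Python example below, the dictionary AGENT_USABLE transcribes the Agents 
column of the table, while the function name agent_view and the representation 
of a micro-protocol as a dictionary of strings are assumptions. 

```python
from typing import Dict

# True when Table 1 marks the field as usable by agents.
AGENT_USABLE: Dict[str, bool] = {
    "Name/Type": True,
    "Synonyms": False,
    "Participant cardinality": True,
    "Other attributes": False,
    "Agent communication language": True,
    "Semantics": False,
    "Ontology": True,
    "Function": False,
    "Behavior": False,
    "Roles": True,
    "Initiator": True,
    "Norms": True,
    "Constraints": True,
    "Prerequisite": True,
    "Termination": True,
    "Input/Output": True,
    "Sample code": True,
}


def agent_view(micro_protocol: Dict[str, str]) -> Dict[str, str]:
    """Keep only the fields that agents are expected to process."""
    return {name: value for name, value in micro_protocol.items()
            if AGENT_USABLE.get(name, False)}
```

Designers would read the full record, whereas an agent would be handed only 
the result of agent_view. 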

5 Protocol Level 

In Section 4, we described the micro-protocols. These micro-protocols can be 
decomposed into micro-protocols, giving several levels of micro-protocols. At 
the lower end of this decomposition is the communicative act level, containing 
the communicative acts exchanged in the messages. At the upper end of the 
decomposition is the protocol level, which encompasses all the micro-protocols 
used in the interaction protocol, as shown in Figure 2. 




[Diagram levels, top to bottom: protocol level, micro-protocol level, 
communicative act level] 



Fig. 2. Decomposition of Interaction Protocols in our Model 



For instance, in Figure 2 there are four levels: one protocol level, two 
micro-protocol levels and one communicative act level. The protocol level 
differs from the micro-protocol level in that it can be seen as a meta-level 
addressing the aim of the interaction protocol and the structure of the 
interaction. The distinction between protocols and micro-protocols is similar 
to the one between goals and plans in planning. 

Protocols are rendered graphically as diagrams, as shown in Figure 3. A 
diagram contains the protocol name in the upper left corner in a “snippet” 
pentagon, as proposed in UML 2. Constraints on protocols are written in the 
upper right corner either formally (via OCL [21] for instance) or informally. 
The remainder of the diagram contains the different micro-protocols and the 
flow between them. 

x = [car, model X, red, 10,000 per unit, quantity = 7]
y = [car, model X, red, 11,000 per unit, quantity = 7, delay = 4 weeks]
z = [car, model X, red, 10,000 per unit, quantity = 7, delay = 3 weeks]

Fig. 3. Supply Chain Protocol Representation

[Figure: connectors labelled alternative, parallel execution and synchronization]

Fig. 4. Protocol Level Connector

Micro-protocols are rendered as rectangles with the name of the micro-protocol
inside. Micro-protocols are linked together via transitions. Transitions are
directed arrows with an open arrowhead. Two kinds of information are written on
transitions: (1) values used by a micro-protocol and (2) constraints on the
execution of the micro-protocol. Values can be supplied by previous micro-protocols
or by the agent knowledge base. Values can be written directly on diagrams if there
is enough space; otherwise designers can use variables and give the content of the
variables below the diagram, as shown in Figure 3. Constraints are enclosed in
square brackets. Paths in protocols can be split or merged. Designers have
connectors for this purpose, as shown in Figure 4. The diamond connector represents
the OR connector, meaning that several alternatives are possible, of which one and
only one will be chosen. Parallel executions of micro-protocols are depicted through
vertical bars with outgoing arrows. Each arrow corresponds to a parallel execution.
Synchronization of these parallel executions is done via incoming arrows on vertical
bars. The beginning of the protocol is depicted as a solid circle with an arrow
pointing to the initial micro-protocol or connector. The end of the protocol is
denoted by a solid circle with a hollow circle around it. There must be one and only
one beginning, but several endings are possible.
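As an illustration only, the diagram conventions above can be mirrored in a small
data structure. The following Python sketch is not part of our model: the class
names, the node kinds and the example flow are assumptions made for the example, and
the check covers only the rule on beginnings and endings (we additionally assume
that at least one ending is wanted).

```python
from dataclasses import dataclass, field

# Node kinds mirror the graphical elements named in the text.
KINDS = {"start", "end", "micro", "or", "fork", "join"}


@dataclass(frozen=True)
class Transition:
    source: str
    target: str
    values: tuple = ()    # values consumed by the target micro-protocol
    constraint: str = ""  # drawn in square brackets on the diagram


@dataclass
class Protocol:
    name: str
    nodes: dict = field(default_factory=dict)        # node name -> kind
    transitions: list = field(default_factory=list)

    def add_node(self, name, kind):
        if kind not in KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[name] = kind

    def link(self, source, target, values=(), constraint=""):
        for node in (source, target):
            if node not in self.nodes:
                raise KeyError(node)
        self.transitions.append(
            Transition(source, target, tuple(values), constraint))

    def check(self):
        """Return the list of violated rules (empty when well formed)."""
        problems = []
        kinds = list(self.nodes.values())
        if kinds.count("start") != 1:
            problems.append("exactly one beginning is required")
        if kinds.count("end") == 0:
            problems.append("no ending is defined")
        return problems


# Hypothetical fragment of a supply chain protocol; the flow is illustrative
# and is not a transcription of Figure 3.
p = Protocol("Supply chain")
for name, kind in [("begin", "start"), ("Request", "micro"),
                   ("Order evaluation", "micro"), ("choice", "or"),
                   ("Production failure", "micro"),
                   ("Order negotiation", "micro"), ("done", "end")]:
    p.add_node(name, kind)
p.link("begin", "Request")
p.link("Request", "Order evaluation", values=["x"])
p.link("Order evaluation", "choice")
p.link("choice", "Production failure", constraint="order not achievable")
p.link("choice", "Order negotiation", values=["y"], constraint="order negotiable")
p.link("Production failure", "done")
p.link("Order negotiation", "done")
print(p.check())
```

Keeping values and constraints on the transition, rather than on the nodes, follows
the convention that both are written on the arrows of the diagram.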

6 The Example of Supply Chain Management 

To illustrate our approach, we present the example of Supply Chain Management.
First, we briefly describe, in a simplified view, what Supply Chain Management is.
Then we show the different phases of the process and how they can be matched with
micro-protocols. Finally, we give the complete description of one of the
micro-protocols, as space limitations do not allow us to describe them all.

In its global view, supply chain management describes the material, information and
finance flows between customers, suppliers and manufacturers. The material flow
describes how materials pass from suppliers to customers and how suppliers and
manufacturers attempt to reduce stock. The information flow defines the ordering and
the update of orders. Finally, the finance flow describes how and when payments will
be made and under what conditions. Pragmatically speaking, the management of the
supply chain describes how customers place orders and how suppliers process these
orders.

Fig. 5. Agents in Supply Chain Management

In this paper, we limit the view of supply chain management to the ordering by
customers, the negotiation between customers and companies to fill the order and,
finally, the task allocation and distribution needed to process the order.

We follow the idea of Barbuceanu and Fox [2] and agentify the management of the
supply chain, as shown in Figure 5. The list of agents is given below:

— Client 

— Order acquisition 

— Logistics 

— Scheduler 

— Dispatcher 

— Resource 

— Transporter 

— Provider 

— Worker 

Several agents are involved in this application, as shown in Figure 5:

— a client agent, who places, modifies and deletes orders.

— an order acquisition agent, who receives the orders and negotiates the price
and the delay with the logistics agent. It is responsible for the connection
between clients and the company. It informs clients as soon as deliveries are
postponed or completed.

— a logistics agent, who negotiates the delay and the price with the order
acquisition agent. As soon as an order is accepted, it asks the scheduler agent
to generate a plan for this order. This plan is sent to the transporter,
resource and dispatcher agents. It asks for a new plan as soon as modifications
appear, either because the client wants to modify the order or because delays
arise in production.

— a scheduler agent, who generates a plan according to the order, the delay and
the allocation of workers and resources.

— a transporter agent, who is responsible for the delivery of raw materials for
the production and for the delivery of the final product to the client.

— a resource agent, who manages the allocation of materials needed to fulfil
the order. If the materials are not available, it places an order with provider
agents.

— a dispatcher agent, who manages worker agents. It allocates work to worker
agents. The dispatcher agent informs the logistics agent if problems arise that
cause a delay in the delivery.

— several worker agents, who carry out the work. If they are unable to complete
their tasks on time, they inform the dispatcher agent.

— several provider agents, who provide raw materials if the company does not
have enough materials for the order. In fact, a provider agent is an order
acquisition agent from another company.

As stated above, there are several phases in supply chain management: request,
evaluation, negotiation and finally task allocation. The request phase involves
customers and order acquisition agents, who define orders. Then, the company
evaluates whether the order can be processed in terms of products to be
manufactured and constraints on company production and scheduling. If this
evaluation fails, the company enters a negotiation loop with the customer to relax
constraints such as prices, delays or features. Finally, the company allocates tasks
to workers to fill the order. The production is finished when the order is delivered
to the customer.

All the phases defined above are good candidates to be shaped as micro-protocols.
Moreover, such phases are relatively reusable across projects. For instance, the
request and negotiation micro-protocols can be used whenever designers have to deal
with multiagent systems for commerce. The same holds for task allocation, which is
used in cooperation systems and in meeting scheduling.

Several micro-protocols are proposed to represent the management of supply 
chain: 

1. Request micro-protocol, used by the client agents to contact the order 
acquisition agent of the company and send an order. 

2. Order evaluation micro-protocol, composed of several micro-protocols 
not detailed here which involve every agent of the company and whose func- 
tion is to evaluate if the order is achievable or not. 

3. Production failure micro-protocol, to inform the client that the com- 
pany is unable to process the order. 

4. Order negotiation micro-protocol, used by clients and order acquisition
agents to negotiate price, delay and any other elements of importance.

5. Tasks distribution micro-protocol, which involves the scheduler and the
dispatcher agents, who distribute the different tasks among the worker and the
transporter agents in order to fulfil the order. A Contract Net is performed
prior to distribution to determine which workers and transporters have to be
involved.

The composition of these micro-protocols is depicted in Figure 3. This figure
shows the data flow between micro-protocols.

Due to lack of space, we only give here the order negotiation micro-protocol.

Name/type: Order Negotiation / Negotiation 
Synonyms: none 
Participant Cardinality: 1-1 
Other Attributes: none 

Agent Communication Language: FIPA ACL 
Ontology: Negotiation, Sale 

Function: this protocol is employed when the order acquisition agent cannot
satisfy the order and needs to negotiate some points with the client. Most of
the time, these points concern the delay or the price. The negotiation stage is
based on proposals and counter-proposals; either it reaches an agreement or
the negotiation fails.

Behavior: The order acquisition agent first sends the new proposal to the client 
by issuing a proposal via the propose message. The content of the message is 
the new conditions proposed by the company. The client agent receives the 
proposal and has three choices: (1) accepting it (via the accept message), 
(2) rejecting it (via the reject message) and finally, (3) countering it (via the 
counter message). In the first case, the negotiation ends since an agreement 
is reached. In the second case, the negotiation fails since the client gave 
up. In the third case, the client modifies the proposal and sends its counter- 
proposal to the order acquisition agent. Proposals and counter-proposals can 
continue as long as the client does not answer either by the accept message 
or the reject message. The automaton corresponding to this micro-protocol 
is in Figure 6. 




Fig. 6. Order Negotiation Example 



Roles: Client, cardinality = 1
Order Acquisition, cardinality = 1

Initiator: Order Acquisition

Constraints: agents are benevolent and do not get stuck in a loop in which
no solution will appear.

Prerequisite: Some criteria of the order to fill cannot be satisfied. The
order acquisition agent must have a counter-proposal which the company can
process.

Termination: Three situations are possible for the termination:

1. Client accepts the counter-proposal

2. Order Acquisition rearranges the production to fill this order and satisfy
the constraints

3. Order cannot be processed since no agreement was found

Input/Output: Input: counter-proposal
Output: order with a set of agreed constraints or no order
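The Behavior field above can be read as a small automaton. The following Python
sketch is one possible reading, given for illustration: the message names (propose,
accept, reject, counter) come from the description, whereas the state names and the
function run are hypothetical. Termination case 2 is not modelled, since the
description does not name the message that would signal it.

```python
# (state, sender, message) -> next state. Message names come from the
# Behavior field; the state names are hypothetical.
TRANSITIONS = {
    ("start", "order_acquisition", "propose"): "proposed",
    ("proposed", "client", "accept"): "agreed",
    ("proposed", "client", "reject"): "failed",
    ("proposed", "client", "counter"): "countered",
    ("countered", "order_acquisition", "propose"): "proposed",
}
FINAL_STATES = {"agreed", "failed"}


def run(messages):
    """Replay (sender, message) pairs and return the state reached."""
    state = "start"
    for sender, message in messages:
        key = (state, sender, message)
        if key not in TRANSITIONS:
            # Covers both illegal messages and messages sent after the end.
            raise ValueError(
                f"{message!r} from {sender} not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


# One round of counter-proposal followed by an agreement.
outcome = run([
    ("order_acquisition", "propose"),
    ("client", "counter"),
    ("order_acquisition", "propose"),
    ("client", "accept"),
])
print(outcome, outcome in FINAL_STATES)
```

The loop between the states proposed and countered corresponds to the exchange of
proposals and counter-proposals, which continues until the client answers with
accept or reject.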



7 Conclusion 

Modularity and reusability are now two terms frequently used in multiagent system
design, particularly in agent design. Unfortunately, reuse is barely considered when
designing interaction protocols. As Singh [22] noted, designers need to start from
scratch each time. A first proposal for reusability was made in [15]. This approach
is based on micro-protocols and the CPDL language. Micro-protocols represent parts
of the interaction. They are composed through the CPDL language. This proposal
allows flexibility, modularity and reusability, but it remains simple and, in a
certain way, misses its objectives. In fact, reusability is not properly addressed
in this proposal, since two relatively similar protocols will be difficult to
represent and to distinguish. This paper attempts to fix this problem through the
addition of several fields that increase the expressiveness of micro-protocols.
Moreover, micro-protocols can now be used by both agents and designers, whereas the
previous version of micro-protocols was dedicated to agents.

Several directions of work are already considered. First, we will generate 
micro-protocols for the protocols defined in the FIPA library [9] in order to 
exemplify this approach. Then, we will provide a library that will be accessible by 
both agents and designers. Designers can add, delete and modify micro-protocols 
through a Web interface. Agents can search and retrieve micro-protocols. The 
final step of our work on this approach will be to update the interaction module 
[14] and define an application that emphasizes the use of micro-protocols and 
their reuse.

Abstract. Dialogue protocols have been the subject of considerable attention
with respect to their potential applications in multiagent system environments.
Formalisations of such protocols define classes of dialogue locutions, concepts
of a dialogue state, and rules under which a dialogue proceeds. One important
consideration in implementing a protocol concerns the criteria an agent should
apply in choosing which utterance will constitute its next contribution: ideally,
an agent should select a locution that (by some measure) optimises the outcome.
The precise interpretation of optimise may vary greatly depending on the nature
and intent of a dialogue area. One option is to choose the locution that results in
a minimal length debate. We present a formal setting for considering the problem
of deciding if a particular utterance is optimal in this sense and show that this
decision problem is both NP-hard and co-NP-hard.



1 Introduction 

Methods for modeling discussion and dialogue processes have proved to be of great
importance in describing multiagent interactions. The study of dialogue protocols
ranges over perspectives such as argumentation theory, e.g. [26, 30]; taxonomies of
types of dialogue, such as [30, 33]; and formalisms for describing and reasoning
about protocols, e.g. [17, 23, 25]. Among the many applications that have been
considered are bargaining and negotiation processes, e.g. [18, 26, 21]; legal
reasoning, e.g. [15, 20, 2, 3, 29]; persuasion in argumentation and other systems,
e.g. [32, 12, 4, 9]; and inquiry and information-discovery, e.g. [22, 24]. The
collections of articles presented in [8, 16] give an overview of various
perspectives relating to multiagent discourse.

While we present a general formal model for dialogue protocols below, informally
we may view the core elements of such a model as comprising a description of the
‘locution types’ for the protocol (“what participants can say”); the topics of
discussion (“what participants talk about”); and how discussions may start, evolve,
and finish.

Despite the diverse demands of protocols imposing special considerations of interest
with particular applications, there are some properties that might be considered
desirable irrespective of the protocol’s specific domain, cf. [25]. Among such
properties are termination; the capability to validate that a discussion is being
conducted according to the protocol; and the ability for participants to determine
“sensible” contributions. In [17] frameworks for uniform comparison of protocols are
proposed that are defined independently of the application domain. In principle, if
two distinct protocols can be shown ‘equivalent’ in the senses defined by [17], then
termination and other properties need only be proved for one of them.

In this paper our concern is with the following problem: in realising a particular
discussion protocol within a multiagent environment, one question that must be
addressed by each participant can, informally, be phrased as “what do/should I say
next?”
In other words, each agent must “be aware of” its permitted (under the protocol rules) 
utterances given the progress of the discussion so far, and following specific criteria, 
either choose to say nothing or contribute one of these. While the extent to which a 
protocol admits a ‘reasonable’ decision-making process is, of course, a property that 
is of domain-independent interest, one crucial feature distinguishing different types of 
discussion protocol is the criteria that apply when an agent makes its choice. More pre- 
cisely, in making a contribution an agent may be seen as “optimising” the outcome. A 
clear distinction between protocol applications is that the sense of “optimality” in one 
protocol may be quite different from “optimality” in another. For example, in multia- 
gent bidding and bargaining protocols, a widely-used concept of “optimal utterance” 
is based on the view that any utterance has the force of affording a particular “utility 
value” to the agent invoking it that may affect the utility enjoyed by other agents. In 
such settings, the policy (often modeled as a probability distribution) is “optimal” if 
no agent can improve its (expected) utility by unilaterally deviating. This concept,
the Nash equilibrium, has been the subject of intensive research and there is strong
evidence of its computational intractability [5]. While valid as a criterion for
utterances in multiagent
bargaining protocols, such a model of “optimality” is less well-suited to fields such as 
persuasion, information-gathering, etc. We may treat a “persuasion protocol” as one in 
which an agent seeks to convince others of the validity of a given proposition.
Interpreting such persuasion protocols as proof mechanisms (a view used in, among
others, [32, 12]), we contend that a more appropriate sense of an utterance being
“optimal” is that it allows the discussion to be concluded “as quickly as
possible”¹. There are several reasons why such a measure is appropriate with respect
to persuasion protocols. In
practice, discussions in which one agent attempts to persuade another to carry out some 
action cannot (reasonably) be allowed to continue indefinitely; an agent may be unable 
to continue with other tasks which are time-constrained in some sense until other agents 
in the system have been persuaded through some reasoned discussion to accept partic- 
ular propositions. It is, of course, the case that describing optimality in terms of length 
of discussion provides only one measure. We discuss alternative notions of optimality 
in the concluding sections. 

Concentrating on persuasion protocols we formulate the “optimal utterance prob- 
lem” and establish lower bounds on its complexity. In the next section we outline an 
abstract computational framework for dialogue protocols and introduce two variants of 
the optimal utterance decision problem. In Section 3 we present a setting in which
this problem is proved to be both NP-hard and co-NP-hard. Conclusions and further
work are presented in the final section.



1 An alternative view is proposed in [11], where it is argued that utterances which
prolong discussions can, in certain settings, be seen as “optimal”.

2 Definitions

Definition 1 Let $\mathcal{F}$ be the (infinite) set of all well-formed formulae
(wff) in some propositional language (where we assume an enumerable set of
propositional variables $x_1, x_2, \ldots$).

A dialogue arena, denoted $\mathcal{A}$, is a (typically infinite) set of finite
subsets of $\mathcal{F}$. For a dialogue arena,

$$\mathcal{A} = \{\Phi_1, \Phi_2, \ldots, \Phi_k, \ldots\}, \qquad \Phi_i \subset \mathcal{F}$$

the set of wff in $\Phi_i = \{\psi_1, \psi_2, \ldots, \psi_q\}$ is called a dialogue
context from the dialogue arena $\mathcal{A}$.

Definition 2 A dialogue schema is a triple $\langle \mathcal{L}, \mathcal{D}, \Phi \rangle$,
where $\mathcal{L} = \{L_j \mid 1 \leq j \leq l\}$ is a finite set of locution
types, $\mathcal{D}$ is a dialogue protocol as defined below, and $\Phi$ is a
dialogue context.

We are interested in reasoning about properties of protocols operating in given
dialogue arenas. In the following, $\mathcal{A} = \{\Phi_1, \Phi_2, \ldots, \Phi_k, \ldots\}$
is a dialogue arena, with $\Phi = \{\psi_1, \ldots, \psi_q\}$ a (recall, finite) set
of wff constituting a single dialogue context of this arena.

Definition 3 Let $\mathcal{L} = \{L_j \mid 1 \leq j \leq l\}$ be a set of locution
types. A dialogue fragment over the dialogue context $\Phi$ is a (finite) sequence

$$\mu_1 \cdot \mu_2 \cdots \mu_k$$

where $\mu_t = L_{j,t}(\theta_t)$ is the instantiated locution or utterance (with
$\theta_t \in \Phi$) at time $t$. The commitment represented by a dialogue fragment
$\delta$, denoted $\Sigma(\delta)$, is a subset of the context $\Phi$.

The notation $M^{*}_{\mathcal{L},\Phi}$ is used to denote the set of all dialogue
fragments involving instantiated locutions from $\mathcal{L}$; $\delta$ to denote an
arbitrary member of this set; and $|\delta|$ to indicate the length (number of
utterances) of $\delta$.

In order to represent dialogues of interest we need to describe mechanisms by which 
dialogue fragments and their associated commitments evolve. 

Definition 4 A dialogue protocol for the discussion of the context $\Phi$ using
locution set $\mathcal{L}$ is a pair $\mathcal{D} = \langle \Pi, \Sigma \rangle$
defined by:

a. A possible dialogue continuation function

$$\Pi : M^{*}_{\mathcal{L},\Phi} \rightarrow \wp(\mathcal{L} \times \Phi) \cup \{\bot\}$$

The subset of dialogue fragments $\delta$ in $M^{*}_{\mathcal{L},\Phi}$ having
$\Pi(\delta) \neq \bot$ is called the set of legal dialogues over
$\langle \mathcal{L}, \Phi \rangle$ in the protocol $\mathcal{D}$, this subset being
denoted $T_{\mathcal{D}}$. It is required that the empty dialogue fragment,
$\varepsilon$, containing no locutions, is a legal dialogue, i.e.
$\Pi(\varepsilon) \neq \bot$, and we call the set $\Pi(\varepsilon)$ the legal
commencement locutions². We further require that $\Pi$ satisfies the following
condition:

$$\forall \delta \in M^{*}_{\mathcal{L},\Phi} \quad (\Pi(\delta) = \bot) \Rightarrow (\forall \mu = L_j(\theta) \quad \Pi(\delta \cdot \mu) = \bot)$$

i.e. if $\delta$ is not a legal dialogue then no dialogue fragment starting with
$\delta$ is a legal dialogue.

b. A commitment function $\Sigma : T_{\mathcal{D}} \rightarrow \wp(\Phi)$
associating each legal dialogue with a subset of the dialogue context $\Phi$.

2 Note that we allow $\Pi(\varepsilon) = \emptyset$, although the dialogues that
result from this case are unlikely to be of significant interest.

This definition abstracts away ideas concerning commencement, combination and
termination rules into the pair $\langle \Pi, \Sigma \rangle$ through which the
possible dialogues of a protocol and the associated states (subsets of $\Phi$) are
defined. Informally, given a legal dialogue, $\delta$, $\Pi(\delta)$ delineates all
of the utterances that may be used to continue the discussion.

A dialogue, $\delta$, is terminated if $\Pi(\delta) = \emptyset$ and partial if
$\Pi(\delta) \neq \emptyset$.

We now describe mechanisms for assessing dialogue protocols in terms of the length
of a dialogue. The following notation is used.

$$\Delta = \{\Delta_k\} = \{\langle \mathcal{L}, \mathcal{D} = \langle \Pi, \Sigma \rangle, \Phi_k \rangle\}$$

is a (sequence of) dialogue schemata for an arena

$$\mathcal{A} = \{\Phi_1, \Phi_2, \ldots, \Phi_k, \ldots\}$$

Although one can introduce concepts of dialogue length predicated on the number of
utterances needed to attain a particular state $\Theta$, the decision problem we
consider will focus on the concept of “minimal length terminated continuation of a
dialogue fragment $\delta$”. Formally,

Definition 5 Let $\langle \mathcal{L}, \mathcal{D} = \langle \Pi, \Sigma \rangle, \Phi_k \rangle$
be a dialogue schema $\Delta_k$ instantiated with the context $\Phi_k$ of
$\mathcal{A}$. Let $\delta \in M^{*}_{\mathcal{L},\Phi_k}$ be a dialogue fragment.
The completion length of $\delta$ under $\mathcal{D}$ for the context $\Phi_k$,
denoted $\chi(\delta, \mathcal{D}, \Phi_k)$, is

$$\min \{\, |\eta| \; : \; \eta \in T_{\mathcal{D}}, \ \eta = \delta \cdot \zeta, \ \Pi(\eta) = \emptyset \,\}$$

if such a dialogue fragment exists, and undefined otherwise.

Thus the completion length of a (legal) dialogue, $\delta$, is the least number of
utterances in a terminated dialogue that starts with $\delta$. We note that if
$\delta$ is not a legal dialogue then $\chi(\delta, \mathcal{D}, \Phi_k)$ is always
undefined.

The decision problem whose properties we are concerned with is called the Generic 
Optimal Utterance Problem. 

Definition 6 An instance of the Generic Optimal Utterance Problem (GOUP) comprises

$$\mathcal{U} = \langle \Delta, \delta, \mu \rangle$$

where $\Delta = \langle \mathcal{L}, \mathcal{D}, \Phi \rangle$ is a dialogue schema
with locution set $\mathcal{L}$, protocol $\mathcal{D} = \langle \Pi, \Sigma \rangle$,
and dialogue context $\Phi$; $\delta \in M^{*}_{\mathcal{L},\Phi}$ is a dialogue
fragment; and $\mu \in \mathcal{L} \times \Phi$ is an utterance.

An instance $\mathcal{U}$ is accepted if there exists a dialogue fragment
$\eta \in M^{*}_{\mathcal{L},\Phi}$ for which all of the following hold:

1. $\eta = \delta \cdot \mu \cdot \zeta \in T_{\mathcal{D}}$.

2. $\Pi(\eta) = \emptyset$.

3. $|\eta| = \chi(\delta, \mathcal{D}, \Phi)$.

If any of these fail to hold, the instance is rejected.

Thus, given representations of a dialogue schema together with a partial dialogue
$\delta$ and utterance $\mu$, an instance is accepted if there is a terminated
dialogue ($\eta$) which commences with the dialogue fragment $\delta \cdot \mu$ and
whose length is the completion length of $\delta$ under $\mathcal{D}$ for the
context $\Phi$. In other words, the utterance $\mu$ is a legal continuation of
$\delta$ leading to a shortest-length terminated dialogue.
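To make Definitions 5 and 6 concrete, the following minimal Python sketch computes
completion lengths by breadth-first search and then tests the acceptance condition.
It rests on assumptions that are not part of the definitions: the continuation
function is available as a Python callable returning a set of utterances (or None in
place of ⊥), the legal one-step continuations of a fragment are exactly the members
of that set, and the search is cut off at a hypothetical bound max_len. The names
completion_length, is_optimal and TOY are ours.

```python
from collections import deque


def completion_length(delta, continuations, max_len=50):
    """Least length of a terminated legal dialogue extending delta, else None."""
    delta = tuple(delta)
    if continuations(delta) is None:       # delta is not legal: undefined
        return None
    queue = deque([delta])
    while queue:                           # fragments are visited by length
        eta = queue.popleft()
        moves = continuations(eta)
        if moves is None:
            continue
        if not moves:                      # empty set: eta is terminated
            return len(eta)
        if len(eta) < max_len:
            queue.extend(eta + (mu,) for mu in moves)
    return None


def is_optimal(delta, mu, continuations, max_len=50):
    """Acceptance test: some shortest completion of delta starts with delta.mu."""
    delta = tuple(delta)
    moves = continuations(delta)
    if moves is None or mu not in moves:
        return False
    best = completion_length(delta, continuations, max_len)
    via_mu = completion_length(delta + (mu,), continuations, max_len)
    return best is not None and via_mu == best


# Toy protocol given as an explicit table; absent keys play the role of "not legal".
TOY = {
    (): {"a", "b"},
    ("a",): {"c"},
    ("a", "c"): set(),
    ("b",): {"d"},
    ("b", "d"): {"e"},
    ("b", "d", "e"): set(),
}


def toy(fragment):
    return TOY.get(tuple(fragment))


print(completion_length((), toy))   # 2
print(is_optimal((), "a", toy))     # True
print(is_optimal((), "b", toy))     # False
```

The test compares the completion length of $\delta \cdot \mu$ with that of $\delta$;
since every completion of $\delta \cdot \mu$ is also a completion of $\delta$, the
two are equal exactly when a shortest completion of $\delta$ begins with $\mu$. The
search is exhaustive and no claim of efficiency is intended.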

Our formulation of GOUP, as given in Definition 6, raises a number of questions. The
most immediate of these concerns how the schema $\Delta$ is to be represented,
specifically the protocol $\langle \Pi, \Sigma \rangle$. Noting that we have (so
far) viewed $\langle \Pi, \Sigma \rangle$ in abstract terms as mappings from
dialogue fragments to sets of utterances (subsets of the context), one potential
difficulty is that in “most” cases these will not be computable³. We can go some way
to addressing this problem by representing $\langle \Pi, \Sigma \rangle$ through
(encodings of) Turing machine programs $\langle M_{\Pi}, M_{\Sigma} \rangle$ with
the following characteristics: $M_{\Pi}$ takes as its input a pair
$\langle \delta, \mu \rangle$, where $\delta \in M^{*}_{\mathcal{L},\Phi}$ and
$\mu \in \mathcal{L} \times \Phi$, accepting if $\delta \cdot \mu$ is a legal
dialogue and rejecting otherwise; similarly $M_{\Sigma}$ takes as its input a pair
$\langle \delta, \Psi \rangle$ with $\Psi \in \Phi$, accepting if $\delta$ is a
legal dialogue having $\Psi \in \Sigma(\delta)$, rejecting otherwise. There remain,
however, problems with this approach: first, it is not possible, in general, to
validate that a given input is an instance of GOUP, cf. Rice’s Theorem for Recursive
Index Sets in e.g. [10, Chapter 5, pp. 58-61]; secondly, even in those cases where
one can interpret the encoding of $\langle \Pi, \Sigma \rangle$ “appropriately”, the
definition places no time-bound on how long the computation of these programs need
take. There are two methods we can use to overcome these difficulties: one is to
employ ‘clocked’ Turing machine programs, so that, for example, if no decision has
been reached for an instance $\langle \delta, \mu \rangle$ on $M_{\Pi}$ after, say,
$|\delta \cdot \mu|$ steps, then the instance is rejected. The second is to consider
specific instantiations of GOUP with protocols that can be established
“independently” to have desirable efficient decision procedures. More formally,

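Definition 7 Instances of the Optimal Utterance Problem in $\Delta$, denoted
$\mathrm{OUP}^{(\Delta)}$, where
$\{\Delta_k\} = \{\langle \mathcal{L}, \mathcal{D} = \langle \Pi, \Sigma \rangle, \Phi_k \rangle\}$
is a sequence of dialogue schema over the arena
$\mathcal{A} = \{\Phi_k : k \geq 1\}$, comprise

$$\mathcal{U} = \langle \Phi_k, \delta, \mu \rangle$$

where $\delta \in M^{*}_{\mathcal{L},\Phi_k}$ is a dialogue fragment, and
$\mu \in \mathcal{L} \times \Phi_k$ is an utterance.

An instance $\mathcal{U}$ is accepted if there exists a dialogue fragment
$\eta \in M^{*}_{\mathcal{L},\Phi_k}$ for which all of the following hold:

1. $\eta = \delta \cdot \mu \cdot \zeta \in T_{\mathcal{D}}$.

2. $\Pi(\eta) = \emptyset$.

3. $|\eta| = \chi(\delta, \mathcal{D}, \Phi_k)$.

If any of these fail to hold, the instance is rejected.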

3 For example, it is easy to show that the set of distinct protocols that could be
defined using only two locutions and a single element context is not enumerable.

The crucial difference between the problems GOUP and $\mathrm{OUP}^{(\Delta)}$ is
that we can consider the latter in the context of specific protocols without being
concerned about how these are represented: the protocol description does not form
part of an instance of $\mathrm{OUP}^{(\Delta)}$ (only the specific context
$\Phi_k$). In particular, should we wish to consider some ‘sense of complexity’ for
a given schema, we could use the device of employing an ‘oracle’ Turing machine,
$M_{\Delta}$, to report (at unit-cost) whether properties (1-2) hold of any given
$\eta$. With such an approach, should $\Delta$ be such that the set of legal
dialogues for a specific context is finite, then the decision problem
$\mathrm{OUP}^{(\Delta)}$ is decidable (relative to the oracle machine
$M_{\Delta}$). A further advantage is that any lower bound that can be demonstrated
for a specific incarnation of $\mathrm{OUP}^{(\Delta)}$ gives a lower bound on the
“computable fragment” of GOUP. In the next section, we describe a (sequence of)
dialogue schemata, $\{\Delta^{DPLL}_k\}$, for which the following computational
properties are provable.

1. The set of legal dialogues for $\Delta^{DPLL}_k$ is finite: thus every
continuation of any legal partial dialogue will result in a legal terminated
dialogue.

2. Given $\langle \delta, \mu, \Phi_k \rangle$ with $\delta$ a dialogue fragment,
$\mu$ an utterance, and $\Phi_k$ the dialogue context for $\Delta^{DPLL}_k$, there
is a deterministic algorithm that decides if $\delta \cdot \mu$ is a legal dialogue
using time linear in the number of bits needed to encode the instance.

3. Given $\langle \delta, \Psi, \Phi_k \rangle$ with $\delta$ a legal dialogue and
$\Psi$ an element of the context $\Phi_k$, there is a deterministic algorithm
deciding if $\Psi \in \Sigma(\delta)$ using time linear in the number of bits needed
to encode the instance.

We will show that the Optimal Utterance Problem for $\Delta^{DPLL}_k$ is both
NP-hard and co-NP-hard.

3 The Optimal Utterance Problem 

Prior to defining the schema used as the basis of our results, we introduce the
dialogue arena, $\mathcal{A}_{CNF}$, upon which it operates.

Let $\Theta(n)$ ($n \geq 1$) denote the set of all CNF formulae formed from
propositional variables $\{x_1, \ldots, x_n\}$ (so that $|\Theta(n)| = 2^{3^n}$).
For $\Psi \in \Theta(n)$ with

$$\Psi = \bigwedge_{i=1}^{m} \bigvee_{j=1}^{r_i} y_{i,j}, \qquad y_{i,j} \in \{x_k, \neg x_k : 1 \leq k \leq n\}$$

we use $C_i$ to denote the clause $\bigvee_{j=1}^{r_i} y_{i,j}$. Let $\Psi_{rep}$ be
the set of wff given by

$$\{\Psi, C_1, \ldots, C_m, x_1, \ldots, x_n, \neg x_1, \ldots, \neg x_n\}$$

The dialogue arena of formulae in CNF is

$$\mathcal{A}_{CNF} = \bigcup_{n=1}^{\infty} \ \bigcup_{\Psi \in \Theta(n)} \{\Psi_{rep}\}$$

Thus, each different CNF, $\Psi$, gives rise to the dialogue context whose elements
are defined by $\Psi_{rep}$.

We note that $\Phi \in \mathcal{A}_{CNF}$ may be encoded as a word, $\beta(\Phi)$,
over the alphabet $\{-1, 0, 1\}$,

$$\beta(\Phi) = 1^n 0 \alpha \quad \text{with } \alpha \in \{-1, 0, 1\}^{nm}$$

where the $i$'th clause is described by the sub-word

$$\alpha_{(i-1)n+1} \cdots \alpha_{in}$$

so that

$$\alpha_{(i-1)n+k} = \begin{cases} -1 & \text{if } \neg x_k \in C_i \\ 1 & \text{if } x_k \in C_i \\ 0 & \text{if } x_k \notin C_i \text{ and } \neg x_k \notin C_i \end{cases}$$

It is thus immediate that given any word $w \in \{-1, 0, 1\}^{*}$ there is an
algorithm that accepts $w$ if and only if $w = \beta(\Phi)$ for some CNF $\Phi$, and
this algorithm runs in $O(|w|)$ steps.
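The encoding and its linear-time recogniser can be sketched as follows. This is an
illustration under stated assumptions: we read the definition as assigning -1 to a
negated occurrence and 1 to a positive one, literals are represented as non-zero
integers (k for $x_k$, -k for $\neg x_k$), and the function names encode and decode
are hypothetical. The sketch also accepts the degenerate case of zero clauses.

```python
def encode(n, clauses):
    """Build the word 1^n 0 alpha for clauses given as sets of non-zero ints."""
    word = [1] * n + [0]
    for clause in clauses:
        for k in range(1, n + 1):
            if k in clause and -k in clause:
                raise ValueError("a clause with x_k and its negation has no code")
            word.append(1 if k in clause else -1 if -k in clause else 0)
    return word


def decode(word):
    """Return (n, clauses) if word has the form 1^n 0 alpha, otherwise None."""
    n = 0
    while n < len(word) and word[n] == 1:   # read the prefix 1^n
        n += 1
    if n == 0 or n == len(word) or word[n] != 0:
        return None
    alpha = word[n + 1:]
    if len(alpha) % n != 0 or any(a not in (-1, 0, 1) for a in alpha):
        return None
    clauses = []
    for i in range(0, len(alpha), n):       # one block of n symbols per clause
        block = alpha[i:i + n]
        clauses.append({(k + 1) * s for k, s in enumerate(block) if s != 0})
    return n, clauses


# (x1 or not x2) and (x2 or x3) over three variables.
w = encode(3, [{1, -2}, {2, 3}])
print(w)   # [1, 1, 1, 0, 1, -1, 0, 0, 1, 1]
print(decode(w) == (3, [{1, -2}, {2, 3}]))   # True
```

The first 0 in the word delimits the prefix $1^n$ unambiguously, because the prefix
itself contains no 0; the recogniser therefore needs only a constant number of
passes over the word.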

The basis for the dialogue schema we now define is found in the classic DPLL
procedure for determining whether a well-formed propositional formula is satisfiable
or not [6, 7]. Our protocol, the DPLL-dialogue protocol, is derived from the
realisation of the DPLL procedure on CNF formulae.

In describing this we assume some ordering

$$\langle \Phi_1, \Phi_2, \ldots, \Phi_k, \ldots \rangle$$

of the contexts in the arena $\mathcal{A}_{CNF}$.

DPLL-Dialogue Schema. The sequence of DPLL-Dialogue Schema,
$\Delta_{DPLL} = \{\Delta^{DPLL}_k\}$, is defined with contexts from the arena
$\mathcal{A}_{CNF}$ as

$$\Delta^{DPLL}_k = \langle \mathcal{L}_{DPLL}, \ \mathcal{D}_{DPLL} = \langle \Pi_{DPLL}, \Sigma_{DPLL} \rangle, \ \Phi_k \rangle$$

where

$$\mathcal{L}_{DPLL} = \{\mathrm{ASSERT}, \mathrm{REBUT}, \mathrm{PROPOSE}, \mathrm{DENY}, \mathrm{MONO}, \mathrm{UNIT}\}$$

and the set $\Phi_k$ from $\mathcal{A}_{CNF}$ is

$$\Big\{ \bigwedge_{i=1}^{m} C_i, \ C_1, \ldots, C_m, \ x_1, \ldots, x_n, \ \neg x_1, \ldots, \neg x_n \Big\}$$

Recall that $\Psi_k$ denotes the formula $\bigwedge_{i=1}^{m} C_i$, and $C_i$ is the
clause $\bigvee_{j=1}^{r_i} y_{i,j}$ from the context $\Phi_k$. It will be
convenient to regard a clause $C$ both as a disjunction of literals and as a set of
literals, so that we write $y \in C$ when $C$ has the form $y \vee B$.

The protocol $\langle \Pi_{DPLL}, \Sigma_{DPLL} \rangle$ is defined through the
following cases.

At any stage the commitment state $\Sigma_{DPLL}(\delta)$ consists of a (possibly
empty) subset of the clauses of $\Psi_k$ and a (possibly empty) subset of the
literals, subject to the condition that $y$ and $\neg y$ are never simultaneously
members of $\Sigma_{DPLL}(\delta)$. With the exception of
$\{\mathrm{ASSERT}, \mathrm{REBUT}\}$ the instantiated form of any locution involves
a literal $y$.

Case 1: $\delta = \varepsilon$, the empty dialogue fragment.

$$\Pi_{DPLL}(\varepsilon) = \{\mathrm{ASSERT}(\Psi_k)\}$$

$$\Sigma_{DPLL}(\varepsilon) = \emptyset$$

$$\Sigma_{DPLL}(\mathrm{ASSERT}(\Psi_k)) = \{C_i : 1 \leq i \leq m\}$$

In the subsequent development, $y$ is a literal and

$$Open(\delta) = \{C_i : C_i \in \Sigma_{DPLL}(\delta)\}$$

$$Lits(\delta) = \{y : y \in \Sigma_{DPLL}(\delta)\}$$

$$Single(\delta) = \{y : \neg y \notin Lits(\delta) \text{ and } \exists C \in Open(\delta) \text{ s.t. } y \in C \text{ and } \forall z \in C \setminus \{y\} \ \neg z \in Lits(\delta)\}$$

$$Unary(\delta) = \{y : \neg y \notin Lits(\delta) \text{ and } \forall C \in Open(\delta) \ \neg y \notin C \text{ and } \exists C \in Open(\delta) \text{ with } y \in C\}$$

$$Bad(\delta) = \{C_i : C_i \in Open(\delta) \text{ and } \forall y \in C_i \ \neg y \in Lits(\delta)\}$$

Informally, $Open(\delta)$ indicates clauses of $\Psi_k$ that have yet to be
satisfied and $Lits(\delta)$ the set of literals that have been committed to in
trying to construct a satisfying assignment to $\Psi_k$. Over the progress of a
dialogue the literals in $Lits(\delta)$ may, if instantiated to true, result in some
clauses being reduced to a single literal: $Single(\delta)$ is the set of such
literals. Similarly, either initially or following an instantiation of the literals
in $Lits(\delta)$ to true, the set of clauses in $Open(\delta)$ may be such that
some variables occur only positively among these clauses or only negated. The
corresponding literals form the set $Unary(\delta)$. Finally, the course of
committing to various literals may result in a set that contradicts all of the
literals in some clause, so that this set cannot constitute a satisfying
instantiation; the set of clauses in $Bad(\delta)$, if non-empty, indicates that
this has occurred. Notice that the definition of $Single(\delta)$ admits the
possibility of a literal $y$ and its negation both being in this set: a case which
cannot lead to the set of literals in $Lits(\delta)$ being extended to a satisfying
set. Thus we say that the literal set $Lits(\delta)$ is a failing set if either
$Bad(\delta) \neq \emptyset$ or, for some $y$,
$\{y, \neg y\} \subseteq Single(\delta)$.
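The sets just described can be computed directly from the commitment state. The
Python sketch below is illustrative and makes several assumptions of its own:
literals are non-zero integers (k for $x_k$, -k for $\neg x_k$), clauses are
frozensets of literals, the state is passed as the pair (open clauses, committed
literals), and the missing membership symbols in the definitions are read as "not a
member of". The function names are hypothetical.

```python
def single(open_clauses, lits):
    """Literals y forced by a clause whose other literals are all contradicted."""
    out = set()
    for clause in open_clauses:
        for y in clause:
            others_false = all(-z in lits for z in clause if z != y)
            if -y not in lits and others_false:
                out.add(y)
    return out


def unary(open_clauses, lits):
    """Literals occurring in open clauses whose negation occurs in none."""
    occurring = {y for clause in open_clauses for y in clause}
    return {y for y in occurring if -y not in lits and -y not in occurring}


def bad(open_clauses, lits):
    """Open clauses all of whose literals are contradicted by lits."""
    return {c for c in open_clauses if all(-y in lits for y in c)}


def is_failing(open_clauses, lits):
    s = single(open_clauses, lits)
    return bool(bad(open_clauses, lits)) or any(-y in s for y in s)


# (x1 or x2) and (not x1) and (x2 or x3), before any commitment.
clauses = {frozenset({1, 2}), frozenset({-1}), frozenset({2, 3})}
print(single(clauses, set()))          # {-1}
print(sorted(unary(clauses, set())))   # [2, 3]
print(is_failing(clauses, set()))      # False
```

Note that an empty clause is reported by bad for any set of committed literals, and
that two unit clauses $(x)$ and $(\neg x)$ put both literals into single, so both
trivially unsatisfiable situations are detected as failing.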

Recognising that £dpll{ 5) = Open(5) U Lits(6) it suffices to describe changes 
to in terms of changes to Open(5) and Lits(5). 

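Case 2: $\delta \neq \varepsilon$, $Open(\delta) = \emptyset$

$\Pi_{DPLL}(\delta) = \emptyset$

Case 3: $\delta \neq \varepsilon$, $Open(\delta) \neq \emptyset$, $Lits(\delta)$ is not a failing set.

There are a number of sub-cases depending on $\Sigma_{DPLL}(\delta)$.

Case 3.1: $Single(\delta) \neq \emptyset$.

$\Pi_{DPLL}(\delta) = \{UNIT(y) : y \in Single(\delta)\}$

$Open(\delta \cdot UNIT(y)) = Open(\delta) \setminus \{C : y \in C\}$

$Lits(\delta \cdot UNIT(y)) = Lits(\delta) \cup \{y\}$

Case 3.2: $Single(\delta) = \emptyset$, $Unary(\delta) \neq \emptyset$

$\Pi_{DPLL}(\delta) = \{MONO(y) : y \in Unary(\delta)\}$

$Open(\delta \cdot MONO(y)) = Open(\delta) \setminus \{C : y \in C\}$

$Lits(\delta \cdot MONO(y)) = Lits(\delta) \cup \{y\}$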
Case 3.3: $Single(\delta) = Unary(\delta) = \emptyset$

Since $Bad(\delta) = \emptyset$ and $Open(\delta) \neq \emptyset$, instantiating the literals in
$Lits(\delta)$ will neither falsify nor satisfy $\Psi_k$. It follows that the set

$Poss(\delta) = \{y : y \notin Lits(\delta),\ \neg y \notin Lits(\delta) \text{ and }
    \exists C \in Open(\delta) \text{ with } y \in C\}$

is non-empty. We note that since $Unary(\delta) = \emptyset$, $y \in Poss(\delta)$ if and only if
$\neg y \in Poss(\delta)$. This gives,

$\Pi_{DPLL}(\delta) = \{PROPOSE(y) : y \in Poss(\delta)\}$

$Open(\delta \cdot PROPOSE(y)) = Open(\delta) \setminus \{C : y \in C\}$

$Lits(\delta \cdot PROPOSE(y)) = Lits(\delta) \cup \{y\}$
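
The three sub-cases of Case 3 can be sketched as a small move generator. This is only an illustrative reading of the rules above: literals are assumed to be encoded as non-zero integers (negation by sign), and `next_moves` and `commit` are hypothetical helper names.

```python
def next_moves(open_clauses, lits):
    """Legal utterances in Case 3 (Open non-empty, Lits not a failing set)."""
    occurring = {y for c in open_clauses for y in c}
    single = {y for c in open_clauses for y in c
              if -y not in lits and all(-z in lits for z in c - {y})}
    if single:                                   # Case 3.1
        return {("UNIT", y) for y in single}
    unary = {y for y in occurring
             if -y not in lits and -y not in occurring}
    if unary:                                    # Case 3.2
        return {("MONO", y) for y in unary}
    # Case 3.3: the set Poss
    return {("PROPOSE", y) for y in occurring
            if y not in lits and -y not in lits}


def commit(open_clauses, lits, y):
    """Update Open and Lits after UNIT(y), MONO(y) or PROPOSE(y)."""
    return {c for c in open_clauses if y not in c}, lits | {y}
```

Note that, as in the protocol, committing to a literal only removes satisfied clauses from Open; the remaining clauses are left unshortened and falsified literals are detected through Lits.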

This completes the possibilities for Case 3. We are left with, 

Case 4: $\delta \neq \varepsilon$, $Lits(\delta)$ is a failing set.

Let $\delta = ASSERT(\Psi_k) \cdots \mu_t$

Given the cases above, there are only three utterances that $\mu_t$ could be:

$\mu_t \in \{ASSERT(\Psi_k), PROPOSE(y), DENY(y)\}$

Case 4.1: $\mu_t = \mu_1 = ASSERT(\Psi_k)$

Since $Lits(ASSERT(\Psi_k)) = \emptyset$, $\Psi_k$ either contains an empty clause (one
containing no literals), or for some $x$ both $(x)$ and $(\neg x)$ are clauses in
$\Psi_k$. In either case $\Psi_k$ is “trivially” unsatisfiable, giving

$\Pi_{DPLL}(ASSERT(\Psi_k)) = \{REBUT(\Psi_k)\}$

$\Sigma_{DPLL}(ASSERT(\Psi_k) \cdot REBUT(\Psi_k)) = \emptyset$
$\Pi_{DPLL}(ASSERT(\Psi_k) \cdot REBUT(\Psi_k)) = \emptyset$

Case 4.2: $\mu_t = PROPOSE(y)$

$\Pi_{DPLL}(\delta) = \{DENY(y)\}$

$Open(\delta \cdot DENY(y)) = Open(\mu_1 \cdots \mu_{t-1}) \setminus \{C : \neg y \in C\}$

$Lits(\delta \cdot DENY(y)) = Lits(\mu_1 \cdots \mu_{t-1}) \cup \{\neg y\}$

Notice this corresponds to a ‘back-tracking’ move under which, having failed to
complete a satisfying set by employing the literal $y$, its negation $\neg y$ is tried
instead.

Case 4.3: $\mu_t = DENY(y)$

Consider the sequence of utterances given by

$\eta = \mu_2 \cdot \mu_3 \cdots \mu_{t-1} \cdot \mu_t$, with $\mu_t = DENY(y)$

We say that $\eta$ is unbalanced if there is a position $p$ such that
$\mu_p = PROPOSE(z)$ with $DENY(z) \notin \mu_{p+1} \cdots \mu_t$, and balanced otherwise.
If $\eta$ is unbalanced let $index(\eta)$ be the highest such position for which this
holds (so that $p < t$).
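
A minimal sketch of the balance test follows. It assumes an utterance is encoded as a pair (locution name, literal) and uses 0-based list positions rather than the subscripts of the text; the name `index_of_imbalance` is hypothetical.

```python
from typing import List, Optional, Tuple

Utterance = Tuple[str, int]  # assumed encoding, e.g. ("PROPOSE", 3) or ("DENY", 3)


def index_of_imbalance(eta: List[Utterance]) -> Optional[int]:
    """Highest position p with eta[p] = PROPOSE(z) and no later DENY(z).

    Returns None when eta is balanced.
    """
    denied = set()
    # scan from the most recent utterance backwards
    for p in range(len(eta) - 1, -1, -1):
        kind, literal = eta[p]
        if kind == "DENY":
            denied.add(literal)
        elif kind == "PROPOSE" and literal not in denied:
            return p
    return None
```

Scanning backwards means the first unmatched PROPOSE found is the most recent one, which is the position the back-tracking move of Case 4.3(a) returns to.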

We now obtain the final cases in our description. 

4 Note that we distinguish the wff $y$ (a literal used in $\Psi_k$) and $(y)$ (a clause
containing the single literal $y$) within the context $\Phi_k$.
Case 4.3(a): $\eta$ is unbalanced with $index(\eta)$ equal to $p$.

$\Pi_{DPLL}(\delta) = \{DENY(y) : \mu_p = PROPOSE(y)\}$

$Open(\delta \cdot DENY(y)) = Open(\mu_1 \cdots \mu_{p-1}) \setminus \{C : \neg y \in C\}$

$Lits(\delta \cdot DENY(y)) = Lits(\mu_1 \cdots \mu_{p-1}) \cup \{\neg y\}$

Thus this case corresponds to a ‘back-tracking’ move continuing from the “most
recent” position at which a literal $\neg y$ instead of $y$ can be tested.

Finally,

Case 4.3(b): $\eta$ is balanced.

$\Pi_{DPLL}(\delta) = \{REBUT(\Psi_k)\}$

$\Sigma_{DPLL}(\delta \cdot REBUT(\Psi_k)) = \emptyset$
$\Pi_{DPLL}(\delta \cdot REBUT(\Psi_k)) = \emptyset$

We state the following without proof. 

Theorem 1 In the following, $\delta$ is a dialogue fragment over the locutions of the
DPLL protocol, $\Phi_k$ a context from $\mathcal{A}_{CNF}$, and $N(\delta, \Phi_k)$ is the number
of bits used to encode $\delta$ and $\Phi_k$ under some reasonable encoding scheme.

1. The problem of determining whether $\delta$ is a legal dialogue for the protocol
$\mathcal{D}_{DPLL}$ in context $\Phi_k$ can be decided in $O(N(\delta, \Phi_k))$ steps.

2. The problem of determining whether $\delta$ is a terminated legal dialogue for the
protocol $\mathcal{D}_{DPLL}$ in context $\Phi_k$ is decidable in $O(N(\delta, \Phi_k))$ steps.

3. For any $\psi \in \Phi_k$, the problem of determining whether $\psi \in \Sigma_{DPLL}(\delta)$ is
decidable in $O(N(\delta, \Phi_k))$ steps.

4. For all contexts $\Phi_k \in \mathcal{A}_{CNF}$, the set of legal dialogues over $\Phi_k$ in the
protocol $\mathcal{D}_{DPLL}$ is finite.

5. If $\delta$ is a terminated dialogue of $\mathcal{D}_{DPLL}$ in context $\Phi_k$ then
$\Sigma_{DPLL}(\delta) \neq \emptyset$ if and only if $\Psi_k$ is satisfiable. Furthermore,
instantiating the set of literals in $Lits(\delta)$ to true yields a satisfying
assignment for $\Psi_k$.

Before analysing this protocol we review how it derives from the basic DPLL- 
procedure. Consider the description of this below. 

DPLL-Procedure.

Input: Set of clauses $C$
       Set of literals $L$

if $C = \emptyset$ return true.                                      (SAT)
if any clause of $C$ is empty
   or $C$ contains clauses $(y)$ and $(\neg y)$ (for some literal $y$)
   return false.                                                     (UNSAT)
if $C$ contains a clause containing a single literal $y$
   return DPLL($C^{|y}$, $L \cup \{y\}$)                             (U)
if there is a literal $y$ such that $\neg y$ does not occur in any
   clause (and $y$ occurs in some clause)
   return DPLL($C^{|y}$, $L \cup \{y\}$)                             (M)
choose a literal $y$.                                                (B)
if DPLL($C^{|y}$, $L \cup \{y\}$)
   then return true
   else return DPLL($C^{|\neg y}$, $L \cup \{\neg y\}$)              (FAIL)



For a set of clauses $C$ and literal $y$, the set of clauses $C^{|y}$ is formed by
removing all clauses $C_i$ for which $y \in C_i$, and deleting the literal $\neg y$ from
all clauses $C_j$ having $\neg y \in C_j$.

To test if $\Psi = \bigwedge_{i=1}^{m} C_i$ is satisfiable, the procedure is called with
input $C = \{C_1, \ldots, C_m\}$ and $L = \emptyset$.

Lines (U) and (M) are the “unit-clause” and “monotone literal” rules which improve
the run-time of the procedure: these are simulated by the UNIT and MONO locutions.
Otherwise a literal is selected - at line (B) - to “branch” on: the PROPOSE locution;
should the choice of branching literal FAIL to lead to a satisfying assignment, its
negation is tested - the DENY locution. Each time a literal is set to true, clauses
containing it can be deleted from the current set - the $Open(\delta)$ of the protocol;
clauses containing its negation contain one fewer literal. Either all clauses will be
eliminated ($C$ is satisfiable) or an empty clause will result (the current set of
literals chosen is not a satisfying assignment). When all choices have been exhausted
the method will conclude that $C$ is unsatisfiable.
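
For readers who prefer running code, the procedure can be sketched in Python as follows. This is a minimal illustration rather than a tuned solver: clauses are assumed to be frozensets of non-zero integers (negation by sign), the tie-breaking rules for choosing a literal are arbitrary choices made here, and the function returns the literal set $L$ (or None) instead of a bare truth value.

```python
def restrict(clauses, y):
    """C|y: drop clauses containing y and delete -y from the remaining clauses."""
    return {c - {-y} for c in clauses if y not in c}


def dpll(clauses, lits=frozenset()):
    """Return a set of literals satisfying all clauses, or None if unsatisfiable."""
    if not clauses:
        return lits                                        # (SAT)
    units = {next(iter(c)) for c in clauses if len(c) == 1}
    if any(len(c) == 0 for c in clauses) or any(-y in units for y in units):
        return None                                        # (UNSAT)
    if units:
        y = min(units, key=abs)                            # (U) unit-clause rule
        return dpll(restrict(clauses, y), lits | {y})
    occurring = {y for c in clauses for y in c}
    mono = {y for y in occurring if -y not in occurring}
    if mono:
        y = min(mono, key=abs)                             # (M) monotone literal rule
        return dpll(restrict(clauses, y), lits | {y})
    y = min(abs(z) for z in occurring)                     # (B) branch on a variable
    result = dpll(restrict(clauses, y), lits | {y})
    if result is not None:
        return result
    return dpll(restrict(clauses, -y), lits | {-y})        # (FAIL) try the negation


# Usage: (x1 or x2) and (not x1 or x2) and (not x2 or x3)
cnf = {frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2, 3})}
model = dpll(cnf)
```

The choice made at line (B) is exactly where the question of an optimal branching literal arises: different choices can produce search trees of very different sizes.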

The motivation for the form of the dialogue protocol $\Delta_{DPLL}$ is the connection
between terminated dialogues in $\mathcal{D}_{DPLL}$ and search trees in the DPLL-procedure above.

Definition 8 Given a set of clauses $C$, a DPLL-search tree for $C$ is a binary tree, $S$,
recursively defined as follows: if $C = \emptyset$ or $C$ conforms to the condition specified
by UNSAT in the DPLL-procedure, then $S$ is the empty tree, i.e. $S$ contains no nodes.
If $y$ is a monotone literal or defines a unit-clause in $C$, then $S$ comprises a root
labelled $y$ whose sole child is a DPLL-search tree for the set $C^{|y}$. If none of these
four cases apply, $S$ consists of a root labelled with the branching literal $y$ chosen
in line (B) with at most two children - one comprising a DPLL-search tree for the set
$C^{|y}$, the other child - if the case (FAIL) arises - a DPLL-search tree for the set
$C^{|\neg y}$.

A DPLL-search tree is full if no further expansion of it can take place (under the
procedure above).

The size of a DPLL-search tree, $S$ - $\nu(S)$ - is the total number of edges 5 in $S$. A
full DPLL-search tree, $S$, is minimum for the set of clauses $C$, if given any full
DPLL-search tree, $R$ for $C$, $\nu(S) \le \nu(R)$. Finally, a literal $y$ is an optimal
branching literal for a clause set $C$, if there is a minimum DPLL-search tree for $C$
whose root is labelled $y$.

We say a set of clauses, $C$, is non-trivial if $C \neq \emptyset$. Without loss of generality
we consider only CNF-formulae, $\Psi$, whose clause set is non-trivial. Of course, during
the evolution of the DPLL-procedure and the dialogue protocol $\mathcal{D}_{DPLL}$ sets of
clauses which are trivial may result (this will certainly be the case if $\Psi$ is
satisfiable): our assumption refers only to the initial instance set.

5 The usual definition of size is as the number of nodes in $S$, however, since $S$ is a
tree this value is exactly $\nu(S) + 1$.
Theorem 2 Let $\Psi = \bigwedge_{i=1}^{m} C_i$ be a CNF-formula over propositional variables
$\{x_1, \ldots, x_n\}$. Let $C(\Psi)$ and $\Phi_k$ be respectively the set of clauses in $\Psi$
and the dialogue context from the arena $\mathcal{A}_{CNF}$ corresponding to $\Psi$, i.e. the
context set described above.

1. Given any full DPLL-search tree, $S$, for $C(\Psi)$ there is a legal terminated
dialogue, $\delta_S \in \mathcal{D}_{DPLL}$ for which,

$\delta_S = ASSERT(\Psi_k) \cdot \eta_S \cdot \mu$

and $|\eta_S| = \nu(S)$, with $\mu$ being one of the locution types in
{PROPOSE, UNIT, MONO, REBUT}.

2. Given any legal terminated dialogue $\delta = ASSERT(\Psi_k) \cdot \eta \cdot \mu$, with

$\mu \in \{REBUT(\Psi_k), PROPOSE(y), MONO(y), UNIT(y)\}$

there is a full DPLL-search tree, $S_\delta$ having $\nu(S_\delta) = |\eta|$.

Proof. Let $\Psi$, $C(\Psi)$, and $\Phi_k$ be as in the Theorem statement. For Part 1, let $S$
be any full DPLL-search tree for the clause set $C(\Psi)$. We obtain the result by
induction on $\nu(S) \ge 0$.

For the inductive base, $\nu(S) = 0$, either $S$ is the empty tree or $S$ contains a single
node labelled $y$. In the former instance, since $\Psi$ is non-trivial it must be the
case that $\Psi$ is unsatisfiable (by reason of containing an empty clause or opposite
polarity unit clauses). Choosing

$\delta_S = ASSERT(\Psi_k) \cdot \eta_S \cdot REBUT(\Psi_k)$

with $\eta_S = \varepsilon$ is a legal terminated dialogue (Case 4.1) and $|\eta_S| = 0 = \nu(S)$.

When $S$ contains a single node, so that $\nu(S) = 0$, let $y$ be the literal labelling
this. It must be the case that $C(\Psi)$ is satisfiable - it cannot hold that
$C(\Psi)^{|y}$ and $C(\Psi)^{|\neg y}$ both yield empty search trees, since this would imply
the presence of unit-clauses $(y)$ and $(\neg y)$ in $C(\Psi)$ 6. Thus the literal $y$ occurs
in every clause of $C(\Psi)$. If $y$ is a unit-clause, the dialogue fragment,

$\delta_S = ASSERT(\Psi_k) \cdot UNIT(y)$

is legal (Case 3.1) and terminated (Case 2). Fixing $\eta_S = \varepsilon$ and $\mu = UNIT(y)$ gives
$|\eta_S| = 0 = \nu(S)$ and $\delta = ASSERT(\Psi_k) \cdot \eta_S \cdot \mu$ a legal terminated dialogue.
If $y$ is not a unit clause, we obtain an identical conclusion using $\eta_S = \varepsilon$ and
$\mu = MONO(y)$ via Case 3.2 and Case 2.

Now, inductively assume, for some $M$, that if $S_M$ is a DPLL-search tree for a set of
clauses $C(\Psi)$, with $\nu(S_M) < M$ then there is a legal terminated dialogue,
$\delta_{S_M}$, over the corresponding context, $\Phi$, with
$\delta_{S_M} = ASSERT(\Psi) \cdot \eta_{S_M} \cdot \mu$ and $|\eta_{S_M}| = \nu(S_M)$.
Let $S$ be a DPLL-search tree for $C(\Psi)$ with $\nu(S) = M \ge 1$. Consider the literal,
$y$, labelling the root of $S$. Since $\nu(S) \ge 1$, the set $C(\Psi)^{|y}$ is non-empty. If
$C(\Psi_k)$ contains a unit-clause, then $(y)$ must be one such, thus $S$ comprises the
root labelled $y$ and a single child, $S^{|y}$, forming a full DPLL-search tree for the
(non-empty) clause
6 It should be remembered that at most one of $\{y, \neg y\}$ occurs in any clause.
set $C(\Psi)^{|y}$. It is obvious that $\nu(S^{|y}) < \nu(S) \le M$, so from the Inductive
Hypothesis, there is a legal terminated dialogue, $\delta_{|y}$, in the context formed by
the CNF $\Psi^{|y}$. Hence,

$\delta_{|y} = ASSERT(\Psi^{|y}) \cdot \eta_{|y} \cdot \mu$

and $|\eta_{|y}| = \nu(S^{|y})$. From Case 3.1, the dialogue fragment

$\delta_S = ASSERT(\Psi) \cdot UNIT(y) \cdot \eta_{|y} \cdot \mu$

is legal and is terminated. Setting $\eta_S = UNIT(y) \cdot \eta_{|y}$, we obtain

$|\eta_S| = 1 + |\eta_{|y}| = 1 + \nu(S^{|y}) = \nu(S)$

A similar construction applies in those cases where $y$ is a monotone literal -
substituting the utterance MONO($y$) for UNIT($y$) - and when $y$ is a branching literal
with exactly one child $S^{|y}$ - in this case, substituting the utterance PROPOSE($y$)
for UNIT($y$).

The remaining case is when $S$ comprises a root node labelled $y$ with two children -
$S^{|y}$ and $S^{|\neg y}$ - the former a full DPLL-search tree for the clause set
$C(\Psi)^{|y}$, the latter a full DPLL-search tree for the set $C(\Psi)^{|\neg y}$. We use
$\Phi^{|y}$ and $\Phi^{|\neg y}$ to denote the contexts in $\mathcal{A}_{CNF}$ corresponding to these
CNF-formulae. As in the previous case, $\nu(S^{|y}) < \nu(S) = M$ and
$\nu(S^{|\neg y}) < \nu(S) = M$. Invoking the Inductive Hypothesis, we identify legal
terminated dialogues, over the respective contexts $\Phi^{|y}$ and $\Phi^{|\neg y}$

$\delta_{|y} = ASSERT(\Psi^{|y}) \cdot \eta_{|y} \cdot \mu_y$

$\delta_{|\neg y} = ASSERT(\Psi^{|\neg y}) \cdot \eta_{|\neg y} \cdot \mu_{\neg y}$

with $|\eta_{|y}| = \nu(S^{|y})$ and $|\eta_{|\neg y}| = \nu(S^{|\neg y})$.

We first note that the set $C(\Psi)^{|y}$ cannot be satisfiable - if it were the
search-tree $S^{|\neg y}$ would not occur. We can thus deduce that
$\mu_y = REBUT(\Psi^{|y})$. Now consider the dialogue fragment, $\delta_S$, from the context
$\Phi$.

$ASSERT(\Psi) \cdot PROPOSE(y) \cdot \eta_{|y} \cdot DENY(y) \cdot \eta_{|\neg y} \cdot \mu_{\neg y}$

Certainly this is a legal terminated dialogue via the Inductive Hypothesis and Cases
4.2, 4.3(a-b). In addition, with

$\eta_S = PROPOSE(y) \cdot \eta_{|y} \cdot DENY(y) \cdot \eta_{|\neg y}$

we have

$|\eta_S| = 2 + |\eta_{|y}| + |\eta_{|\neg y}| = 2 + \nu(S^{|y}) + \nu(S^{|\neg y}) = \nu(S)$

so completing the Inductive proof of Part 1.

For Part 2 we use an inductive argument on $|\eta| \ge 0$. Let
$\delta = ASSERT(\Psi_k) \cdot \eta \cdot \mu$ be a legal terminated dialogue in $\mathcal{D}_{DPLL}$ as
above, with

$\mu \in \{REBUT(\Psi_k), PROPOSE(y), MONO(y), UNIT(y)\}$

For the inductive base, we have $|\eta| = 0$, in which event it must hold that

$\delta \in \{ASSERT(\Psi_k) \cdot REBUT(\Psi_k),\ ASSERT(\Psi_k) \cdot MONO(y),\ ASSERT(\Psi_k) \cdot UNIT(y)\}$

In the first of these, via Case 4.1, $\Psi_k$ is unsatisfiable by virtue of it
containing an empty clause or having both $(x)$ and $(\neg x)$ as clauses. Thus, choosing
$S$ as the empty tree gives $\nu(S) = |\eta| = 0$. In the remaining two possibilities,
$\Psi_k$ must be satisfied by the instantiation that sets the literal $y$ to true and
now choosing $S$ as the tree consisting of a single node labelled $y$ gives a full DPLL
search tree for $\Psi_k$ with $\nu(S) = |\eta| = 0$.

For the inductive step, assume that given any

$\delta' = ASSERT(\Psi') \cdot \eta' \cdot \mu$,

a legal terminated dialogue in $\mathcal{D}_{DPLL}$ in which $|\eta'| < r + 1$ for some $r \ge 0$,
there is a full DPLL search tree $S'$ for $\Psi'$ with $\nu(S') = |\eta'|$. We show that if

$\delta = ASSERT(\Psi) \cdot \eta \cdot \mu$

is a legal terminated dialogue in $\mathcal{D}_{DPLL}$ in which $|\eta| = r + 1$ then we can
construct a full DPLL search tree, $S$, for $\Psi$ with $\nu(S) = r + 1$. Noting that
$|\eta| \ge 1$, let $\mu_1$ be the first locution occurring in $\eta$, so that

$\delta = ASSERT(\Psi) \cdot \mu_1 \cdot \eta' \cdot \mu$

It must be the case that

$\mu_1 \in \{MONO(y), UNIT(y), PROPOSE(y)\}$

For the first two,

$\delta' = ASSERT(\Psi^{|y}) \cdot \eta' \cdot \mu$

is a legal terminated dialogue for the set of clauses $C(\Psi)^{|y}$, thus by the
inductive hypothesis there is a full DPLL search tree $S^{|y}$ for this set with
$\nu(S^{|y}) = |\eta'| = r$. Defining the tree $S$ by taking a single node labelled $y$
whose only child is the root of $S^{|y}$ provides a full DPLL search tree for $\Psi$
whose size is exactly $|\eta| = r + 1$.

The remaining possibility is $\mu_1 = PROPOSE(y)$. First suppose that the locution
DENY($y$) does not occur in $\eta'$: then, exactly as in our previous two cases,

$\delta' = ASSERT(\Psi^{|y}) \cdot \eta' \cdot \mu$

is a legal terminated dialogue for the set of clauses $C(\Psi)^{|y}$ and we form a full
DPLL search tree $S$ for $\Psi$ from $S^{|y}$ - a full DPLL search tree for the set of
clauses $C(\Psi)^{|y}$ which has size $|\eta'|$ via the inductive hypothesis - by adding a
single node labelled $y$ whose sole child is the root of $S^{|y}$. The resulting tree
has size $|\eta|$ as required.

Finally, we have the case in which DENY($y$) does occur in $\eta'$. For such,

$\delta = ASSERT(\Psi) \cdot PROPOSE(y) \cdot \eta_1 \cdot DENY(y) \cdot \eta_2 \cdot \mu$

Consider the two dialogues

$\delta_y = ASSERT(\Psi^{|y}) \cdot \eta_1 \cdot REBUT(\Psi^{|y})$

$\delta_{\neg y} = ASSERT(\Psi^{|\neg y}) \cdot \eta_2 \cdot \mu$

Clearly $\delta_y$ is a legal terminated dialogue for the set of clauses $C(\Psi)^{|y}$ and,
similarly, $\delta_{\neg y}$ one for the set of clauses $C(\Psi)^{|\neg y}$. Hence, by the
inductive hypothesis we find full DPLL search trees - $S^{|y}$ and $S^{|\neg y}$ - of sizes
$|\eta_1|$ and $|\eta_2|$ respectively for these clause sets. Consider the DPLL search
tree, $S$, formed by adding a single node labelled $y$ whose left child is the root of
$S^{|y}$ and whose right child that of $S^{|\neg y}$. Then

$\nu(S) = \nu(S^{|y}) + \nu(S^{|\neg y}) + 2 = |\eta_1| + |\eta_2| + 2 = |\eta|$

Thus completing the inductive argument.

Corollary 1. An instance,

$U = (\Phi_k, ASSERT(\Psi_k), PROPOSE(y))$

of the Optimal Utterance Problem for $\Delta_{DPLL}$ is accepted if and only if $y$ is
neither a unit-clause nor a monotone literal and $y$ is an optimal branching literal
for the clause set $C(\Psi_k)$.

Proof. If $y$ defines a unit-clause or monotone literal in $\Psi_k$ then PROPOSE($y$) is
not a legal continuation of ASSERT($\Psi_k$). The corollary is now an easy consequence
of Theorem 2: suppose that

$\delta = ASSERT(\Psi_k) \cdot PROPOSE(y) \cdot \eta \cdot \mu_y$

is a minimum length completion of ASSERT($\Psi_k$), then Part 2 of Theorem 2 yields a
full DPLL-search tree, $R$, for $C(\Psi_k)$ of size $1 + |\eta|$ whose root is labelled $y$.
If $R$ is not minimum then there is a smaller full DPLL-search tree, $S$. From Part 1 of
Theorem 2 this yields a legal terminated dialogue

$ASSERT(\Psi_k) \cdot \mu_S \cdot \eta_S \cdot \mu$

with

$\nu(S) = |\mu_S \cdot \eta_S \cdot \mu| - 1 < |PROPOSE(y) \cdot \eta \cdot \mu_y| - 1 = \nu(R)$

which contradicts the assumption that $\delta$ is a minimum length completion.

We now obtain a lower bound on the complexity of $OUP^{(\Delta)}$ via the following result
of Liberatore [19].

Fact 1 (Liberatore [19]) Given an instance $(C, y)$ where $C$ is a set of clauses and $y$
a literal in these, the problem of deciding whether $y$ is an optimal branching
literal for the set $C$ is NP-hard and co-NP-hard.

Theorem 3 The Optimal Utterance in $\Delta$ Problem is NP-hard and co-NP-hard.

Proof. Choose $\Delta$ as the sequence of schema $\{\Delta^{k}_{DPLL}\}$. From Corollary 1 an
instance $(\Phi_k, ASSERT(\Psi_k), PROPOSE(y))$ is accepted in $OUP^{(\Delta)}$ if and only if
$y$ does not form a unit-clause of $\Psi_k$, is not a monotone literal, and is an
optimal branching literal for the clause set $C(\Psi_k)$. We may assume (since these are
easily tested) that the first two conditions do not hold, whence it follows that
decision methods for such instances of $OUP^{(\Delta)}$ yield decision methods for
determining if $y$ is an optimal branching literal for $C(\Psi_k)$. The complexity lower
bounds now follow directly from Liberatore’s results stated in Fact 1.
4 Conclusion

The principal contentions of this paper are three-fold: firstly, in order for a dialogue pro- 
tocol to be realised effectively in a multiagent setting, each agent must have the capa- 
bility to determine what contribution(s) it must or should or can make to the discussion 
as it develops; secondly, in deciding which (if any) utterance to make, an agent should 
(ideally) take cognisance of the extent to which its utterance is ‘optimal’; and, finally, 
the criteria by which an utterance is judged to be ‘optimal’ are application dependent. 
In effect, the factors that contributors take into consideration when participating in one 
style of dialogue, e.g. bargaining protocols, are not necessarily those that would be 
relevant in another style, e.g. persuasion protocols. 

We have proposed one possible interpretation of “optimal utterance in persuasion 
protocols”: that which leads to the debate terminating ‘as quickly as possible’. There 
are, however, a number of “length-related” alternatives that may merit further study. 
We have already mentioned in passing the view explored in [11]. One drawback to the 
concept of “optimal utterance” as we have considered it, is that it presumes the protocol 
is “well-behaved” in a rather special sense: taking the aim of an agent in a persua- 
sion process as “to convince others that a particular proposition is valid”, the extent to 
which an agent is successful may depend on the ‘final’ commitment state attained. In 
the DPLL-protocol this final state is either always empty (if $\Psi_k$ is not
satisfiable) or always non-empty: the protocol is “sound” in the sense that
conflicting interpretations of the final state are not possible. Suppose we consider
persuasion protocols where there is an ‘external’ interpretation of final state, e.g.
using a method of defining some (sequence of) mappings
$\tau : \wp(\Phi_k) \to \{true, false, \bot\}$, so that a terminated dialogue, $\delta$, with
$\tau(\Sigma(\delta)) = true$ indicates that the persuading agent has successfully
demonstrated its desired hypothesis; $\tau(\Sigma(\delta)) = false$ indicates that its
hypothesis is not valid; $\tau(\Sigma(\delta)) = \bot$ indicates that no conclusion can be
drawn 7. There are good reasons why we may wish to implement ‘seemingly
contradictory’ protocols, i.e. in which the persuasion process for a given context
$\Phi$ can terminate in any (or all) of true, false or $\bot$ states, e.g. to model
concepts of cautious, credulous, and sceptical agent belief, cf. [27]. In such cases
defining “optimal utterance” as that which can lead to a shortest terminated dialogue
may not be ideal: the persuading agent’s view of “optimal” is not simply to terminate
discussion but to terminate in a true state; in contrast, “sceptical” agents may seek
utterances that (at worst) terminate in the inconclusive $\bot$ state. We note
that, in such settings, there is potentially an “asymmetry” in the objectives of individual 
agents - we conjecture that in suitably defined protocols and contexts with appropri- 
ately defined concepts of “optimal utterance” the decision problems arising are likely 
to prove at least as intractable as those for the basic variant we consider in Theorem 3. 

A natural objection to the use of length-related measures to assess persuasion pro- 
cesses is that these do not provide any sense of how convincing a given discourse might 
be, i.e. that an argument can be presented concisely does not necessarily render it ef- 
fective in persuading those to whom it is addressed. One problem with trying formally 
to capture concepts of persuasiveness is that, unlike measures based on length, this is 

7 For example, game theorists in economics have considered the situation where two advocates 
try to convince an impartial judge of the truth or otherwise of some claim, e.g. [14,31]. 
a subjective measure: a reasoning process felt to be extremely convincing by one party
may fail to move another. One interesting problem in this respect concerns modeling 
the following scenario. Suppose we have a collection of agents with differing knowl- 
edge and ‘prejudices’ each of whom an external agent wishes to persuade to accept 
some proposition, e.g. election candidates seeking to persuade a cross-section of voters 
to vote in their favour. In such settings one might typically expect contributions by the 
persuading party to affect the degree of conviction felt by members of the audience in 
different ways. As such the concept of an ‘optimal’ utterance might be better assessed 
in terms of proportionate increase in acceptance that the individual audience members 
hold after the utterance is made. Recent work in multi-agent argumentation has con- 
sidered dialogues between agents having different knowledge, different prejudices or 
different attitudes to the utterance and acceptance of uncertain claims, e.g. [1, 28]. 

We conclude by mentioning two open questions of interest within the context of 
persuasion protocols and the optimal utterance problem in these. In practical terms, 
one problem of interest is, informally, phrased as follows: can one define “non-trivial” 
persuasion protocols for a “broad” collection of dialogue contexts within which the 
optimal utterance problem is tractable? We note that it is unlikely that dialogue
arenas encompassing the totality of all propositional formulae will admit such
protocols; however, for those subsets which have efficient decision procedures, e.g.
Horn clauses or 2-CNF formulae, appropriate methods may be available. A second issue
is to consider complexity bounds for other persuasion protocols: e.g. one may develop
schema for the arena $\mathcal{A}_{CNF}$ defined via the TPI-dispute mechanism of [32]. The
complexity (lower and upper bounds) of the optimal utterance problem in this setting
is open, although in view of our results concerning $\mathcal{D}_{DPLL}$ it is plausible to
conjecture that the optimal utterance problem for $\mathcal{D}_{TPI}$ will also prove
intractable.
Abstract. This paper studies argumentation-based dialogues between
agents. It takes a previously defined system by which agents can trade 
arguments and examines in detail what locutions are passed between 
agents. This makes it possible to identify finer-grained protocols than 
has been previously possible, exposing the relationships between differ- 
ent kinds of dialogue, and giving a deeper understanding of how such 
dialogues could be automated. 



1 Introduction 

When building multi-agent systems, we take for granted the fact that the agents 
which make up the system will need to communicate: to resolve differences of 
opinion and conflicts of interest; to work together to resolve dilemmas or find 
proofs; or simply to inform each other of pertinent facts. Many of these com- 
munication requirements cannot be fulfilled by the exchange of single messages. 
Instead, the agents concerned need to be able to exchange a sequence of messages 
which all bear upon the same subject. In other words they need the ability to 
engage in dialogues. As a result of this requirement, there has been much work 
on providing agents with the ability to hold such dialogues. Recently some of 
this work has considered argument-based approaches to dialogue, for example 
the work by Dignum et al. [5], Parsons and Jennings [17], Reed [24], Schroeder
et al. [25] and Sycara [26].

Reed’s work built on an influential model of human dialogues due to argu- 
mentation theorists Doug Walton and Erik Krabbe [27], and we also take their 
dialogue typology as our starting point. Walton and Krabbe set out to analyse 
the concept of commitment in dialogue, so as to “provide conceptual tools for 
the theory of argumentation” [27, page ix]. This led to a focus on persuasion dia- 
logues, and their work presents formal models for such dialogues. In attempting 
this task, Walton and Krabbe recognised the need for a characterisation of dia- 
logues, and so they present a broad typology for inter-personal dialogue. They 
make no claims for its comprehensiveness. 



F. Dignum (Ed.): ACL 2003, LNAI 2922, pp. 329-348, 2004. 
Their categorisation identifies six primary types of dialogues and three mixed
types. The categorisation is based upon: what information the participants each 
have at the commencement of the dialogue (with regard to the topic of discus- 
sion); what goals the individual participants have; and what goals are shared by 
the participants, goals we may view as those of the dialogue itself. This dialogue 
game view of dialogues, revived by Hamblin [12] and extending back to Aristo- 
tle, overlaps with work on conversational policies (see, for example, [4,7]), but 
differs in considering the entire dialogue rather than dialogue segments. 

As defined by Walton and Krabbe, the three types of dialogue we consider 
here are: 

Information-Seeking Dialogues: One participant seeks the answer to some
question(s) from another participant, who is believed by the first to know
the answer(s).

Inquiry Dialogues: The participants collaborate to answer some question or 
questions whose answers are not known to any one participant. 

Persuasion Dialogues: One party seeks to persuade another party to adopt 
a belief or point-of-view he or she does not currently hold. These dialogues 
begin with one party supporting a particular statement which the other 
party to the dialogue does not, and the first seeks to convince the second to 
adopt the proposition. The second party may not share this objective. 

Our previous work investigated capturing these types of dialogue using a formal
model of argumentation [2], the protocols behind these types of dialogue, the
properties and complexity of the dialogues [20,22], and the range of possible
outcomes from the dialogues [21]. Here we extend this investigation, turning
to consider the internal detail of the dialogues, detail that we have previously
skated over.

There are two reasons why we do this. First, we want to make sure that 
the protocols we introduced in [20] are fully specified. From our previous work, 
we already know that they capture the essence of information seeking, inquiry 
and persuasion - here we aim to ensure that all the necessary mechanics are 
in place as well. Second, our previous analysis suggests some deep connections 
between the different protocols - they seem to be variations on a theme rather 
than separate themes - and looking at their internal detail is one way to find 
out if these connections exist. 

Note that, despite the fact that the types of dialogue we are considering are 
drawn from the analysis of human dialogues, we are only concerned here with 
dialogues between artificial agents. Unlike Grosz and Sidner [11] for example, we
choose to focus in this way in order to simplify our task: dealing with artificial
languages avoids much of the complexity inherent in natural language dialogues.

2 Background 

In this section we briefly introduce the formal system of argumentation, due to 
Amgoud [1], that forms the backbone of our approach. This is inspired by the 
work of Dung [6] but goes further in dealing with preferences between arguments.
Further details are available in [1]. We start with a possibly inconsistent
knowledge base $\Sigma$ with no deductive closure. We assume $\Sigma$ contains formulas
of a propositional language $\mathcal{L}$. $\vdash$ stands for classical inference, $\to$ for
material implication, and $\equiv$ for logical equivalence. An argument is a proposition
and the set of formulae from which it can be inferred:

Definition 1. An argument is a pair $A = (H, h)$ where $h$ is a formula of $\mathcal{L}$ and
$H$ a subset of $\Sigma$ such that:

1. $H$ is consistent;

2. $H \vdash h$; and

3. $H$ is minimal, so no proper subset of $H$ satisfying both 1. and 2. exists.

$H$ is called the support of $A$, written $H = Support(A)$ and $h$ is the conclusion
of $A$ written $h = Conclusion(A)$.

We talk of $h$ being supported by the argument $(H, h)$.
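
The three conditions of Definition 1 can be checked by brute force for small propositional examples. The sketch below is an illustration only: it assumes formulas are represented as Python callables over a truth assignment, and the helper names (`models`, `entails`, `is_argument`) are hypothetical. Since every subset of a consistent set is consistent, minimality reduces to checking that no proper subset entails $h$.

```python
from itertools import combinations, product


def models(formulas, atoms):
    """Yield every truth assignment over atoms satisfying all formulas."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(f(v) for f in formulas):
            yield v


def entails(H, h, atoms):
    """Classical entailment, checked by enumerating models of H."""
    return all(h(v) for v in models(H, atoms))


def is_argument(H, h, atoms):
    """True if (H, h) meets conditions 1-3 of Definition 1."""
    consistent = any(True for _ in models(H, atoms))
    if not consistent or not entails(H, h, atoms):
        return False
    # minimality: no proper subset of H already entails h
    return not any(entails(sub, h, atoms)
                   for r in range(len(H))
                   for sub in combinations(H, r))


# Usage: from {a, a -> b} we can build an argument for b
atoms = ["a", "b", "c"]
a = lambda v: v["a"]
b = lambda v: v["b"]
c = lambda v: v["c"]
a_implies_b = lambda v: (not v["a"]) or v["b"]
not_a = lambda v: not v["a"]
```

Adding the irrelevant formula c to the support breaks minimality, and adding the negation of a breaks consistency, so neither variant is an argument for b.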

In general, since $\Sigma$ is inconsistent, arguments in $\mathcal{A}(\Sigma)$, the set of all
arguments which can be made from $\Sigma$, will conflict, and we make this idea precise
with the notion of undercutting:

Definition 2. Let $A_1$ and $A_2$ be two arguments of $\mathcal{A}(\Sigma)$. $A_1$ undercuts $A_2$ iff
$\exists h \in Support(A_2)$ such that $h \equiv \neg Conclusion(A_1)$.

In other words, an argument is undercut if and only if there is another argument 
which has as its conclusion the negation of an element of the support for the 
first argument. 
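
A minimal sketch of the undercut test is given below. It makes a simplifying assumption that is not in the definition: logical equivalence is approximated by syntactic identity, with formulas written as strings and negation marked by a leading "~". The `Argument` type and `negate` helper are hypothetical names.

```python
from typing import FrozenSet, NamedTuple


class Argument(NamedTuple):
    support: FrozenSet[str]
    conclusion: str


def negate(formula: str) -> str:
    """Syntactic negation; a double negation collapses to the bare formula."""
    return formula[1:] if formula.startswith("~") else "~" + formula


def undercuts(a1: Argument, a2: Argument) -> bool:
    """a1 undercuts a2 if the negation of a1's conclusion is in a2's support."""
    return negate(a1.conclusion) in a2.support


# Usage: an argument for ~a undercuts an argument for b built from a and a->b
against_a = Argument(frozenset({"~a"}), "~a")
for_b = Argument(frozenset({"a", "a->b"}), "b")
```

Here `undercuts(against_a, for_b)` holds, while the converse does not: the conclusion of `for_b` contradicts nothing in the support of `against_a`.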

To capture the fact that some facts are more strongly believed 1 we assume
that any set of facts has a preference order over it. We suppose that this ordering
derives from the fact that the knowledge base $\Sigma$ is stratified into
non-overlapping sets $\Sigma_1, \ldots, \Sigma_n$ such that facts in $\Sigma_i$ are all equally
preferred and are more preferred than those in $\Sigma_j$ where $j > i$. The preference
level of a nonempty subset $H$ of $\Sigma$, $level(H)$, is the number of the highest
numbered layer which has a member in $H$.

Definition 3. Let $A_1$ and $A_2$ be two arguments in $\mathcal{A}(\Sigma)$. $A_1$ is preferred to
$A_2$ according to $Pref$, $Pref(A_1, A_2)$, iff
$level(Support(A_1)) \le level(Support(A_2))$.
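
The preference relation of Definition 3 is straightforward to compute once each fact is tagged with its layer. In the sketch below the mapping `layer_of` (fact to layer number, with layer 1 the most preferred) and the function names are assumptions made for illustration.

```python
def level(support, layer_of):
    """Number of the highest numbered layer with a member in the support."""
    return max(layer_of[fact] for fact in support)


def pref(support1, support2, layer_of):
    """Pref(A1, A2): A1 is preferred to A2 iff level(Support(A1)) <= level(Support(A2))."""
    return level(support1, layer_of) <= level(support2, layer_of)


# Usage: a hypothetical stratified knowledge base with three layers
layer_of = {"a": 1, "a->b": 2, "~a": 3}
```

With this stratification, an argument supported by {a, a->b} has level 2 and is preferred to one supported by {~a}, which has level 3; an argument is only as strong as its least preferred premise.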

By $\gg^{Pref}$ we denote the strict pre-order associated with $Pref$. If $A_1$ is
preferred to $A_2$, we say that $A_1$ is stronger than $A_2$ 2. We can now define the
argumentation system we will use:

Definition 4. An argumentation system (AS) is a triple
$\langle \mathcal{A}(\Sigma), Undercut, Pref \rangle$ such that:

1 Here we only deal with beliefs, though the approach can also handle desires and 
intentions as in [19] and could be extended to cope with other mental attitudes. 

2 We acknowledge that this model of preferences is rather restrictive and in the future 
intend to work to relax it. 
— $\mathcal{A}(\Sigma)$ is a set of the arguments built from $\Sigma$,

— $Undercut$ is a binary relation representing the defeat relationship between
arguments, $Undercut \subseteq \mathcal{A}(\Sigma) \times \mathcal{A}(\Sigma)$, and

— $Pref$ is a (partial or complete) preordering on $\mathcal{A}(\Sigma) \times \mathcal{A}(\Sigma)$.

The preference order makes it possible to distinguish different types of relation 
between arguments: 

Definition 5. Let $A_1$, $A_2$ be two arguments of $\mathcal{A}(\Sigma)$.

— If $A_2$ undercuts $A_1$ then $A_1$ defends itself against $A_2$ iff
$A_1 \gg^{Pref} A_2$. Otherwise, $A_1$ does not defend itself.

— A set of arguments $S$ defends $A$ iff: for every $B$ such that $B$ undercuts $A$
and $A$ does not defend itself against $B$, there exists $C \in S$ such that $C$
undercuts $B$ and $B$ does not defend itself against $C$.
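
The two notions of defence can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: arguments are abstract names, each carries a precomputed preference level (lower meaning stronger), and the undercut relation is a set of (attacker, target) pairs. All names and data are hypothetical.

```python
from typing import Dict, Set, Tuple

Undercut = Set[Tuple[str, str]]  # (attacker, target) pairs


def defends_itself(a: str, b: str, level: Dict[str, int]) -> bool:
    """Assuming b undercuts a: a defends itself iff a is strictly stronger than b."""
    return level[a] < level[b]


def set_defends(S: Set[str], a: str, undercut: Undercut,
                level: Dict[str, int]) -> bool:
    """True iff S defends a in the sense of the second clause of the definition."""
    for attacker, target in undercut:
        if target != a or defends_itself(a, attacker, level):
            continue  # not an undercutter of a, or a fends it off alone
        # a needs help: some c in S must undercut the attacker, and the
        # attacker must not defend itself against c.
        if not any((c, attacker) in undercut
                   and not defends_itself(attacker, c, level)
                   for c in S):
            return False
    return True


# Hypothetical example: B undercuts A, C undercuts B.
level = {"A": 2, "B": 1, "C": 1}
undercut: Undercut = {("B", "A"), ("C", "B")}
```

Here A cannot defend itself against the stronger B, but the set {C} defends A because C undercuts B and B is not strictly stronger than C.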



Henceforth, $C_{Undercut,Pref}$ will gather all non-undercut arguments and
arguments defending themselves against all their undercutting arguments. In [1],
Amgoud showed that the set $\underline{S}$ of acceptable arguments of the
argumentation system $\langle \mathcal{A}(\Sigma), Undercut, Pref \rangle$ is the
least fixpoint of a function $\mathcal{F}$:

$$S \subseteq \mathcal{A}(\Sigma)$$

$$\mathcal{F}(S) = \{(H, h) \in \mathcal{A}(\Sigma) \mid (H, h) \text{ is defended by } S\}$$

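Definition 6. The set of acceptable arguments for an argumentation system
$\langle \mathcal{A}(\Sigma), Undercut, Pref \rangle$ is:

$$\underline{S} = \bigcup \mathcal{F}_{i \geq 0}(\emptyset) = C_{Undercut,Pref} \cup \left[ \bigcup \mathcal{F}_{i \geq 1}(C_{Undercut,Pref}) \right]$$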



An argument is acceptable if it is a member of the acceptable set. 

An acceptable argument is one which is, in some sense, proven since all the 
arguments which might undermine it are themselves undermined. However, this 
status can be revoked following the discovery of a new argument (possibly as 
the result of the communication of some new information from another agent).
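
For a finite set of arguments, the acceptable set of Definition 6 can be computed by iterating the defence function from the empty set until nothing changes. The sketch below is illustrative only and rests on assumptions: arguments are abstract names with precomputed preference levels (lower meaning stronger), the undercut relation is a set of (attacker, target) pairs, and every identifier is hypothetical. Iteration reaches the least fixpoint here because adding defenders can only enlarge the set of defended arguments.

```python
from typing import Dict, Set, Tuple

Undercut = Set[Tuple[str, str]]  # (attacker, target) pairs


def defends_itself(a: str, b: str, level: Dict[str, int]) -> bool:
    """Assuming b undercuts a: a defends itself iff a is strictly stronger."""
    return level[a] < level[b]


def set_defends(S: Set[str], a: str, undercut: Undercut,
                level: Dict[str, int]) -> bool:
    """True iff every undercutter that a cannot fend off is countered from S."""
    for attacker, target in undercut:
        if target != a or defends_itself(a, attacker, level):
            continue
        if not any((c, attacker) in undercut
                   and not defends_itself(attacker, c, level)
                   for c in S):
            return False
    return True


def defence_function(S: Set[str], arguments: Set[str], undercut: Undercut,
                     level: Dict[str, int]) -> Set[str]:
    """One application of the defence function: all arguments defended by S."""
    return {a for a in arguments if set_defends(S, a, undercut, level)}


def acceptable_arguments(arguments: Set[str], undercut: Undercut,
                         level: Dict[str, int]) -> Set[str]:
    """Iterate from the empty set until the result stops changing."""
    current: Set[str] = set()
    while True:
        following = defence_function(current, arguments, undercut, level)
        if following == current:
            return current
        current = following


# Hypothetical example: B undercuts A, C undercuts B, D and E undercut each other.
arguments = {"A", "B", "C", "D", "E"}
level = {"A": 2, "B": 1, "C": 1, "D": 1, "E": 1}
undercut: Undercut = {("B", "A"), ("C", "B"), ("D", "E"), ("E", "D")}
```

In this example the first application yields only C, which is not undercut; the second adds A, which C defends; D and E stay out because neither is strictly stronger than the other and nothing else defends them.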



3 Locutions and Attitudes 

As in our previous work, agents decide what they know by determining which 
propositions they have acceptable arguments for. They assert propositions for 
which they have acceptable arguments, and accept propositions put forward by 
other agents if they find that the arguments are acceptable to them. The exact 
locutions and the way that they are exchanged define a formal dialogue game 
which agents engage in. 

Dialogues are assumed to take place between two agents, for example called
P and C. Each agent has a knowledge base, $\Sigma_P$ and $\Sigma_C$ respectively,
containing their beliefs. In addition, each agent has a further knowledge base,
accessible to both agents, containing commitments made in the dialogue.³ These
commitment stores are denoted $CS(P)$ and $CS(C)$ respectively, and in this dialogue
system an agent’s commitment store is just a subset of its knowledge base. Note that
the union of the commitment stores can be viewed as the state of the dialogue at a
given time. Each agent has access to their own private knowledge base and both
commitment stores. Thus P can make use of
$\langle \mathcal{A}(\Sigma_P \cup CS(C)), Undercut, Pref \rangle$⁴
and C can make use of $\langle \mathcal{A}(\Sigma_C \cup CS(P)), Undercut, Pref \rangle$.

All the knowledge bases contain propositional formulas, are not closed under 
deduction, and all are stratified by degree of belief as discussed above. Here we 
assume that these degrees of belief are static and that both the players agree on 
them, though it is possible [3] to combine different sets of preferences, and it is 
also possible to have agents modify their beliefs on the basis of the reliability of 
their acquaintances [16]. 

With this background, we can present the set of dialogue moves first introduced
in [20]. Each locution has a rule describing how to update commitment
stores after the move, and groups of moves have conditions under which the move
can be made; these are given in terms of the agents’ assertion and acceptance
attitudes (defined below). For all moves, player P addresses the $i$th move of the
dialogue to player C.

assert(p) where p is a propositional formula. 

$CS_i(P) = CS_{i-1}(P) \cup \{p\}$ and $CS_i(C) = CS_{i-1}(C)$

Here p can be any propositional formula, as well as the special character U, 
discussed below. 

assert(S) where S is a set of formulas representing the support of an argument. 

$CS_i(P) = CS_{i-1}(P) \cup S$ and $CS_i(C) = CS_{i-1}(C)$

The counterparts of these moves are the acceptance moves. They can be used
whenever the protocol and the agent’s acceptance attitude allow.

accept(p) where p is a propositional formula.

$CS_i(P) = CS_{i-1}(P) \cup \{p\}$ and $CS_i(C) = CS_{i-1}(C)$

accept(S) where S is a set of propositional formulas.

$CS_i(P) = CS_{i-1}(P) \cup S$ and $CS_i(C) = CS_{i-1}(C)$

There are also moves which allow questions to be posed. 

3 Following Hamblin [12] commitments here are propositions that an agent is prepared 
to defend. 

4 Which, of course, is exactly the same thing as
$\langle \mathcal{A}(\Sigma_P \cup CS(P) \cup CS(C)), Undercut, Pref \rangle$.

challenge(p) where p is a propositional formula.

$CS_i(P) = CS_{i-1}(P)$ and $CS_i(C) = CS_{i-1}(C)$

A challenge is a means of making the other player explicitly state the argument 
supporting a proposition. In contrast, a question can be used to query the other 
player about any proposition. 

question(p) where p is a propositional formula. 

$CS_i(P) = CS_{i-1}(P)$ and $CS_i(C) = CS_{i-1}(C)$

We refer to this set of moves as the set $\mathcal{M}'_{DC}$. The locutions in
$\mathcal{M}'_{DC}$ are similar to those discussed in models of legal reasoning
[8, 23] and it should be noted that there is no retract locution. Note that these
locutions are ones used within dialogues; locutions such as those discussed in [15]
would be required to frame dialogues.
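
The commitment-store update rules attached to the moves above can be sketched as a single function. This is an illustrative sketch only: formulas are encoded as plain strings, the dictionary representation of the stores and the name `update` are hypothetical, and the speaker is passed explicitly so that either player can make a move. In every rule the hearer's store is left unchanged.

```python
from typing import Dict, FrozenSet, Set, Union

Content = Union[str, FrozenSet[str]]  # a single formula or a set of formulas


def update(stores: Dict[str, Set[str]], speaker: str, move: str,
           content: Content) -> Dict[str, Set[str]]:
    """Return the commitment stores after `speaker` makes `move` with `content`."""
    # Copy every store so the previous dialogue state is left untouched.
    new_stores = {agent: set(cs) for agent, cs in stores.items()}
    if move in ("assert", "accept"):
        # The speaker's store grows by {p} for a formula, or by S for a set.
        added = {content} if isinstance(content, str) else set(content)
        new_stores[speaker] |= added
    elif move in ("challenge", "question"):
        pass  # neither store changes
    else:
        raise ValueError(f"unknown move: {move}")
    return new_stores


# Hypothetical three-move exchange between P and C.
s0: Dict[str, Set[str]] = {"P": set(), "C": set()}
s1 = update(s0, "P", "assert", "p")
s2 = update(s1, "C", "challenge", "p")
s3 = update(s2, "P", "assert", frozenset({"q", "q -> p"}))
```

After the exchange, P is committed to p and to the support it offered for p, while C, who only challenged, has made no commitments.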

We also need to define the attitudes which control the assertion and accep- 
tance of propositions. 

Definition 7. An agent may have one of three assertion attitudes.

— a confident agent can assert any proposition $p$ for which it can construct an
argument $(S, p)$.

— a careful agent can assert any proposition $p$ for which it can construct an
argument, if it is unable to construct a stronger argument for $\neg p$.

— a thoughtful agent can assert any proposition $p$ for which it can construct
an acceptable argument $(S, p)$.

Definition 8. An agent may have one of three acceptance attitudes. 

— a credulous agent can accept any proposition p if it is backed by an argument. 

— a cautious agent can accept any proposition $p$ that is backed by an argument
if it is unable to construct a stronger argument for $\neg p$.

— a skeptical agent can accept any proposition p if it is backed by an acceptable 
argument. 

Since agents are typically involved in both asserting and accepting propositions, 
we denote the combination of an agent’s two attitudes as 

$\langle$assertion attitude$\rangle$/$\langle$acceptance attitude$\rangle$

The effects of this range of agent attitudes on dialogue outcomes are studied in
[22], and for the rest of this paper we will largely ignore agents’ attitudes, though
the distinction between agents that are credulous and those that are not becomes
important in a couple of places.
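
Read as decision rules, the assertion and acceptance attitudes of Definitions 7 and 8 run in parallel, which the following sketch makes explicit. It is an illustration under assumptions: the three boolean fields stand in for the outcome of the agent's own argument construction, which is not modelled here, and every class and function name is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Assertion(Enum):
    CONFIDENT = "confident"
    CAREFUL = "careful"
    THOUGHTFUL = "thoughtful"


class Acceptance(Enum):
    CREDULOUS = "credulous"
    CAUTIOUS = "cautious"
    SKEPTICAL = "skeptical"


@dataclass(frozen=True)
class Evidence:
    """What the agent can establish about a proposition p (assumed given)."""
    has_argument: bool             # some argument (S, p) can be built
    stronger_counter: bool         # a stronger argument for not-p can be built
    has_acceptable_argument: bool  # some argument for p is acceptable


def may_assert(attitude: Assertion, ev: Evidence) -> bool:
    """Whether an agent with this assertion attitude may assert p."""
    if attitude is Assertion.CONFIDENT:
        return ev.has_argument
    if attitude is Assertion.CAREFUL:
        return ev.has_argument and not ev.stronger_counter
    return ev.has_acceptable_argument  # thoughtful


def may_accept(attitude: Acceptance, ev: Evidence) -> bool:
    """Whether an agent with this acceptance attitude may accept p."""
    if attitude is Acceptance.CREDULOUS:
        return ev.has_argument
    if attitude is Acceptance.CAUTIOUS:
        return ev.has_argument and not ev.stronger_counter
    return ev.has_acceptable_argument  # skeptical


# Hypothetical case: p has an argument, but a stronger argument for not-p exists.
contested = Evidence(has_argument=True, stronger_counter=True,
                     has_acceptable_argument=False)
```

For the contested proposition, only the confident and credulous attitudes permit the move, which matches the later remark that the credulous case is the one that needs separate treatment.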

4 Types of Dialogue 

Previously [20] we defined three protocols for information seeking, inquiry and
persuasion dialogues. These protocols are deliberately simple, the simplest we
can imagine that can satisfy the definitions given by [27], since we believe that
we need to understand the behaviour of these simple protocols before we are
able to understand more complex protocols.




